Machine Learning: Clustering & Sentiment Analysis
Last Thursday I had the pleasure of speaking at “Nerd Interface”, a meetup sponsored by my employer that covers many exciting topics like Virtual Reality, IoT (Internet of Things), web/mobile development, and user experience.
Here’s the synopsis of the talk:
Have you ever wondered how Facebook knows whether you are “Liberal”, “Conservative”, or “Moderate” without explicitly asking you?
How does Amazon detect fake reviews of its products?
How do the “Twitter Funds” decide what stocks to buy or sell?
How can companies identify detractors and promoters?
In the age of information overload, automated tools are the only way of keeping up with the deluge of data generated every second. Python’s Natural Language Toolkit (NLTK) is one such tool. Natural Language Processing (NLP) and machine learning allow algorithms to extract useful and insightful information from free-form text. During this presentation we’ll see a live demonstration of the sentiment analysis functionality provided by NLTK, and how it computationally identifies and categorizes opinions straight from one of the main content sources of our era: Twitter. We’ll also examine clustering, one of the most common forms of unsupervised learning. Clustering allows us to process large quantities of text and group similar texts together, without user intervention.
The event was broadcast through Facebook Live, and you can see the recording here:
The presentation is available here.
You can also find the source code here. I encourage everyone to take a look at the code, particularly the two Python files. It’s very concise and easy to understand. If you have any questions, don’t hesitate to ask.