A computer in this day and age can perform remarkable tasks, such as understanding human language and making sense of its surroundings. Even more remarkable is that these achievements were made only in the last few decades. But how does it work? We Data Scientists at Sentifi believe we are qualified to answer that question, because our daily work includes making computers read documents, see images and understand them all.
At Sentifi, we work with a large amount of text data every day, which comes from various sources including Twitter, newspapers and blogs. People write about all sorts of topics: football, fashion, the stock market, politics, food recipes and the Kardashians, to name a few. We are interested in what the crowd is tweeting, what journalists are publishing and what bloggers are posting. However, no human being can read through the millions of messages and articles that appear on the internet each day. Thus, we use computers to read all those articles and tweets, then extract the crucial information we are searching for.
The task of the Data Scientists at Sentifi is to develop models that make machines capable of reading and understanding what the crowd is discussing. To accomplish that, we have to enter the realm of Artificial Intelligence and Natural Language Processing. We use machine learning algorithms to make computers “intelligent.” These algorithms are built on mathematical frameworks like statistics, linear algebra and optimization, so basically all those things that you slept through in high school.
We also use such machine learning algorithms to make computers decide whether a message contains financially relevant information. We want messages mentioning the stock market, politics and environmental catastrophes, i.e. things that move stock prices. On average, we capture over 20 million messages every day. Now the question is: How can we structure those messages?
We do this by giving the algorithms examples of what financially relevant messages look like, as well as examples of messages that are not financially relevant. The algorithms find patterns in both types of messages, memorize them, then use them to search for the financially relevant messages among the aforementioned 20 million.
To find and learn the patterns in the messages, our algorithms take several things into account: words, grammar, the frequency of each word and its relationships with other words in the same message. This is actually rather tricky because of challenges such as spelling mistakes, slang words, ambiguous words and even neologisms.
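To make the idea of learning from labeled examples more concrete, here is a minimal sketch in Python. It is not the model we run in production. It assumes a simple Naive Bayes classifier that looks only at word frequencies, and the training messages, labels and function names are made up for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Very naive tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train(examples):
    """Count how often each word appears under each label.

    `examples` is a list of (message, label) pairs.
    """
    word_counts = {}          # label -> Counter of word frequencies
    label_counts = Counter()  # label -> number of training messages
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return word_counts, label_counts, vocab


def classify(model, text):
    """Return the label with the highest Naive Bayes log-score."""
    word_counts, label_counts, vocab = model
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_messages)
        words_in_label = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen during training
            # Add-one smoothing so unseen word/label pairs do not zero out.
            score += math.log(
                (word_counts[label][word] + 1) / (words_in_label + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical, hand-written training examples.
EXAMPLES = [
    ("stock prices fall after earnings report", "relevant"),
    ("central bank raises interest rates", "relevant"),
    ("oil shares drop as markets react", "relevant"),
    ("great football match last night", "other"),
    ("new pasta recipe for dinner", "other"),
    ("fashion week starts in paris", "other"),
]

model = train(EXAMPLES)
label = classify(model, "bank shares fall")
```

With these toy examples, the message "bank shares fall" shares words only with the financially relevant training messages, so it gets that label. A real system would need far more examples and richer features than raw word counts, for the reasons listed above.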
A human mind can easily fill in the gaps and resolve ambiguity, whereas computers still have a hard time doing that. When you see the word “Apple,” what pops into your head? Do you think about the fruit or the company that makes you pay hundreds of dollars for a smartphone? Context makes it easier to understand; however, a computer, even with context, has trouble figuring out whether “Apple” is the fruit or the company. Thus, we’re working hard to come up with methods that make computers smarter and capable of figuring out such cases.
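A very naive way to use context is to count cue words around the ambiguous term. The sketch below does exactly that in Python. The cue lists and the function name are hypothetical and hand-picked for this example; it is meant to show why the problem is hard, not how we solve it.

```python
# Hypothetical cue words for each sense of "Apple".
SENSE_CUES = {
    "company": {"iphone", "stock", "shares", "ceo", "earnings", "smartphone"},
    "fruit": {"pie", "juice", "tree", "eat", "recipe", "orchard"},
}


def guess_sense(message):
    """Pick the sense whose cue words overlap most with the message.

    Returns None when no cue word is present or when the senses tie.
    """
    words = set(message.lower().split())
    scores = {sense: len(words & cues) for sense, cues in SENSE_CUES.items()}
    best = max(scores, key=scores.get)
    top = scores[best]
    if top == 0 or list(scores.values()).count(top) > 1:
        return None  # the context does not settle the question
    return best
```

A message like "Apple shares jump after earnings" contains two company cues and resolves cleanly, but "I love Apple" contains none, so this approach has to give up. Short, informal messages often look like the second case.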
As you can see, we work with exciting new technology at Sentifi, so if you are keen on experimenting with state-of-the-art machine learning algorithms for Natural Language Processing and have an interest in the financial markets, come join us. We have plenty of tasks ahead of us, and we need smart people who can help us complete them.
Maximilian Unfried is a Data Scientist at Sentifi, where he works with machine learning, deep learning and natural language processing. He holds a Master's degree in Complex Adaptive Systems from Chalmers University of Technology, in which he specialized in Artificial Intelligence and Machine Learning. He also spent time as a guest researcher in the Machine Learning Lab at the National Chiao Tung University in Taiwan. His main interests are applied machine learning for natural language understanding and computer vision.
Sign up for a free Sentifi account to take advantage of financial insights from the crowd to maximize your investment strategy.