Each day, large news agencies publish thousands of news articles. Reuters alone publishes over two million articles each year. On any given day, the world looks very confusing based on the news flow: some companies are optimistic and increasing their investment, while others are going bankrupt. Even for a professional analyst who has the time and resources to devote to making sense of the economy, the job is overwhelming. Kontoor is a cloud-based neural network that automatically reads each published news article in real time and creates an unbiased analysis of the news. The system stores the economic and corporate data found in the news so that all relevant information is captured.
A news dashboard view and a predictive model developed with Kontoor and WhereOS
The challenge of automated data collection has traditionally been that the English language is extremely rich in vocabulary and expressions. Each event or phenomenon can be expressed in dozens of different ways, which makes traditional search engines ineffective. “Many platforms are capable of detecting synonyms for nouns and adjectives. Kontoor is the only platform capable of understanding the semantics of verbs as well. Phrasal verbs are essential in the English language. For instance, think of the dozens of expressions there are for something increasing”, says Jukka Taskinen, CEO of Kontoor.
“With Kontoor and WhereOS technologies, we’re able to engage the power of artificial intelligence, or more precisely advanced semantic neural networks, to extract important business information from the news and to train machine learning algorithms, e.g. to predict commodity price movements for different industry sectors”, says JP Partanen, CEO of WhereOS.
The collaboration between WhereOS and Kontoor leverages the strengths of each company: Kontoor’s ability to turn unstructured news into quantitative information, and WhereOS’ ability to integrate and operationalize multiple data sources and to develop machine learning models, APIs and applications on top of the underlying infrastructure. “We have been extremely satisfied with WhereOS, as it has helped us to speed up our development of APIs and UIs based on our data by a factor of 5”, says Jukka Taskinen, CEO of Kontoor. “It has been great to work with the truly talented team of Kontoor, and see how powerful Kontoor semantic artificial intelligence technology can really be in understanding and documenting the news feeds”, says JP Partanen, CEO of WhereOS.
Extracting Information from Unstructured News Text with Kontoor
When we read the news, we are looking for “events” – something that is taking place. This allows us to form an idea of the events that are shaping the direction of the world. Kontoor mimics this by creating generic events for companies and markets. Generic events cover any common company activities, such as increasing investment, hiring people, nominating a CEO, adjusting dividends or incurring legal penalties. Events that have an impact on daily business, such as floods, strikes and weather, can also be identified. Events can be identified for companies, products and countries. Further, Kontoor identifies time periods, people and numeric data that help in the analysis.
During any news day, there is an endless stream of small stories about companies you have never heard of, and hence these stories are dismissed as insignificant. In other words, you are losing almost 100 percent of the information in the news. All stories carry small pieces of data about how a business is performing and how it sees the future. For instance, a story about Beiersdorf's 3Q earnings release says that the company has increased its sales and market share.
“Beiersdorf, the maker of Nivea skin creams, said it snatched market share away from rivals as it posted a 6 percent rise in organic group sales in the first nine months.”
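As a rough illustration, the Beiersdorf sentence above could be stored as a structured event record along the following lines. This is only a sketch: the `NewsEvent` class and all of its field names are hypothetical and are not taken from Kontoor's actual data model.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class NewsEvent:
    """One generic event extracted from a news story (hypothetical schema)."""
    entity: str                        # company, product or country
    entity_type: str                   # "company", "product" or "country"
    event_type: str                    # e.g. "sales", "investment", "strike"
    direction: Optional[str] = None    # "increase", "decrease" or None
    value: Optional[float] = None      # numeric data found in the text
    unit: Optional[str] = None         # e.g. "percent"
    period: Optional[str] = None       # time period mentioned in the text
    published: Optional[date] = None   # publication date, if known
    people: List[str] = field(default_factory=list)


# The quoted sentence, filled in by hand for illustration.
event = NewsEvent(
    entity="Beiersdorf",
    entity_type="company",
    event_type="sales",
    direction="increase",
    value=6.0,
    unit="percent",
    period="first nine months",
)
print(event)
```

Keeping the numeric value, unit and period as separate fields is what would make such records easy to aggregate later.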
It is very difficult and labor intensive to read all the news and store it in a meaningful manner for later use. Kontoor, on the other hand, is able to automatically collect and process such data about all events related to, for example, selected corporations, and to create a fully documented history for each firm. You may not be interested in a particular company, but by aggregating event data for its industry, it is possible to start understanding the big picture of the industry or product segment – trends become visible. By aggregating data for all industries, you will start seeing trends in the economy before the statistical organizations release their data with a lag. The aggregated data can be divided into industry and services to predict PMI, industrial production and other similar indicators.
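The aggregation idea can be sketched in a few lines. The example below is an assumption-laden illustration rather than Kontoor's method: the event tuples are made up (apart from the Beiersdorf sales increase mentioned above), "Company A", "Company B" and "Company C" are placeholders, and `net_balance` is a hypothetical helper that reports the share of increases minus the share of decreases.

```python
from collections import Counter, defaultdict

# Hypothetical extracted events: (company, industry, event_type, direction)
events = [
    ("Beiersdorf", "consumer goods", "sales", "increase"),
    ("Company A", "consumer goods", "sales", "increase"),
    ("Company B", "consumer goods", "sales", "decrease"),
    ("Company C", "steel", "sales", "decrease"),
]


def aggregate_by_industry(events):
    """Count event directions per (industry, event_type) pair."""
    counts = defaultdict(Counter)
    for _company, industry, event_type, direction in events:
        counts[(industry, event_type)][direction] += 1
    return counts


def net_balance(counter):
    """Share of increases minus share of decreases, between -1 and 1."""
    total = counter["increase"] + counter["decrease"]
    if total == 0:
        return 0.0
    return (counter["increase"] - counter["decrease"]) / total


counts = aggregate_by_industry(events)
for key, counter in sorted(counts.items()):
    print(key, round(net_balance(counter), 2))
```

A single company's story says little on its own, but a balance computed over many companies in the same industry starts to resemble a trend indicator.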
Kontoor Understands the Difference Between the Past and the Future
There is a significant difference in economic value between past events and expected future events, and Kontoor specializes in telling the past from the future when processing the news. This enables Kontoor to automatically gather expectations about, for instance, commodity production changes from thousands of articles. Similarly, expectations for companies and markets can be gathered from the news. It is therefore possible to form a synthesis of the opinions of industry experts and companies.
Let’s have a look at some practical examples: The airline company SAS expects its profits to improve (a future expectation) as it reported 3Q earnings (a past event). The steelmaker Thyssenkrupp expects its profitability to suffer. By gathering similar data for the industry, we can see whether these events are company specific or an indicator of an industry trend.
“Scandinavian airline SAS hiked its full-year earnings outlook on Friday as third-quarter profits topped market expectations.”
“Thyssenkrupp (TKAG.DE) cut the profit margin forecast for its capital goods business on Tuesday.”
Kontoor similarly detects expectations for sales, orders, production, investment and so forth, for any imaginable economic and corporate event.
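To make the distinction concrete, the two quotes above can be written down as events carrying a time frame label and then separated into reported results and expectations. This is a hand-labeled sketch: the tuple layout, the `signal` values and the `split_by_time_frame` helper are assumptions for illustration, whereas Kontoor derives such labels from the text itself.

```python
# Each tuple: (company, metric, signal, time_frame), labeled by hand here.
events = [
    ("SAS", "profit", "above expectations", "past"),        # 3Q result
    ("SAS", "earnings outlook", "raised", "future"),        # full-year outlook
    ("Thyssenkrupp", "profit margin forecast", "cut", "future"),
]


def split_by_time_frame(events):
    """Separate already reported events from expected future events."""
    reported = [e for e in events if e[3] == "past"]
    expected = [e for e in events if e[3] == "future"]
    return reported, expected


reported, expected = split_by_time_frame(events)
print("reported:", reported)
print("expected:", expected)
```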
Kontoor Semantic Artificial Intelligence Technology
Kontoor’s semantic deep neural network is trained with thousands and thousands of examples collected from the news: examples of each expression used for corporate and economic events. Based on the training material, the network is built up as a multidimensional matrix of words that form clusters for each event we want to detect. Learning in the network happens as new expressions arrive that are close enough to some existing cluster. The network predicts the meaning of a new expression from its proximity to existing expressions.
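The proximity idea can be illustrated with a deliberately simple toy. The sketch below is not Kontoor's network: it swaps the learned word matrix for plain bag-of-words vectors and cosine similarity, and the cluster contents, the `0.3` threshold and the function names are all assumptions made for the example.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words vector; a crude stand-in for learned word vectors."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical training expressions for two event clusters.
clusters = {
    "increase": ["sales rose", "profit climbed", "sales went up", "output picked up"],
    "decrease": ["sales fell", "profit dropped", "sales went down", "output slowed down"],
}


def classify(expression, clusters, threshold=0.3):
    """Return the label of the closest cluster, or None if none is close enough."""
    vec = vectorize(expression)
    best_label, best_score = None, 0.0
    for label, examples in clusters.items():
        score = max(cosine(vec, vectorize(ex)) for ex in examples)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None


def learn(expression, clusters, threshold=0.3):
    """Add a new expression to its cluster when it is close enough to one."""
    label = classify(expression, clusters, threshold)
    if label is not None and expression not in clusters[label]:
        clusters[label].append(expression)
    return label


print(classify("revenue jumped", clusters))   # shares no words with any cluster yet
print(learn("revenue went up", clusters))     # close to "sales went up", so it is absorbed
print(classify("revenue jumped", clusters))   # now overlaps with the learned expression
```

The last three lines show the learning step described above: once "revenue went up" has joined a cluster, a previously unknown expression that resembles it can be recognized as well.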
The Kontoor platform consists of the core Kontoor technology and the infrastructure around it. The infrastructure is similar to that of any search engine and stores indexed news stories, with artificial intelligence on top. The artificial intelligence layer consists of an IBM Watson-style first-generation component and a semantic neural network.
The core technology is based on the Stanford NLP library for natural language processing and on various proprietary tools, models and databases, including a machine learning model for sentiment analysis and deep neural network based models for event extraction. The training material for the AI models is constructed from huge unstructured news datasets, thousands of handcrafted rules and data engineering.
The infrastructure built around the Kontoor technology consists of different Hadoop ecosystem technologies. Data storage is organized with HDFS (Hadoop Distributed File System) and HBase. Data indexing and search features are implemented with a multi-server SolrCloud installation and Cloudera HBase-Solr indexers. News tracking and fetching is done by our own tracking module, and stream processing is handled with Spark Streaming and the Kafka messaging system. Together, these technologies allow Kontoor to fetch new data, process it, store it and make the results available with a minimal time delay. The Kontoor platform is available via a Play framework and ReactJS based user interface and an Akka based API.
Creating Predictive Model and Application UI with WhereOS
WhereOS connects to the Kontoor backend to receive a stream of quantitative data related to the news (events, locations, time, sentiment, etc.). WhereOS can be used to create a layer of functionality on top of the data by implementing different data processing pipelines in the WhereOS backend:
- Data APIs – The processed news data can be made available through APIs, and new APIs can be created dynamically within minutes.
- Combining the news data with other data sources – The quantitative data extracted from the news sources can be combined with other data sources, such as commodity price index information.
- Machine learning models – Data from the news can be used to train machine learning models (e.g. XGBoost in this case) against, for example, commodity price index information (or any other quantitative data), creating an ML model that can predict, e.g., the price index value based on the processed news feed received from the Kontoor API.
- Application UI – Data can be easily visualised in different types of user interface applications for various purposes.
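The combining and modelling steps in the list above can be sketched as follows. This is a minimal illustration under stated assumptions: the dates, signal values and index values are made up, the helper names are hypothetical, and a plain least-squares line stands in for the XGBoost model mentioned above so that the example runs with the standard library only.

```python
# Made-up daily news signal (e.g. net share of "increase" events) and price index.
news_signal = {
    "2020-01-01": 0.2,
    "2020-01-02": 0.4,
    "2020-01-03": -0.1,
    "2020-01-04": 0.6,   # no price available for this date
}
price_index = {
    "2020-01-01": 101.0,
    "2020-01-02": 102.0,
    "2020-01-03": 99.5,
    "2020-01-05": 103.0,  # no news signal available for this date
}


def join_on_date(signal, prices):
    """Inner join of the two sources on their shared dates."""
    return [(signal[d], prices[d]) for d in sorted(signal) if d in prices]


def fit_line(pairs):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


pairs = join_on_date(news_signal, price_index)
slope, intercept = fit_line(pairs)


def predict(signal_value):
    """Predict the price index from a news signal value."""
    return slope * signal_value + intercept


print(len(pairs), "joined rows")
print("predicted index for signal 0.6:", round(predict(0.6), 2))
```

In a real pipeline the join would run as a distributed query and the model would use many features per day, but the shape of the workflow (join on a shared key, fit, predict from new news data) stays the same.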
WhereOS uses Spark and Hive as the execution engine for the pipelines, which means that data processing always runs as distributed computing and can scale to billions of data points. WhereOS pipelines can be created using the SQL and R programming languages, with built-in functions for ETL (extract, transform, load) operations, statistical analysis, machine learning and artificial intelligence (AI), geographical and geospatial analysis, data visualization and more, which makes it easy to implement different data processing actions.