|
|
The Reuters NewsScope Event Indices. AlphaSimplex Group.
Abstract: The Reuters NewsScope Event Indices Project is an integrated framework for incorporating real-time news from the Reuters NewsScope subscription service into systematic investment and risk-management protocols. The framework consists of a set of real-time event indices (each one taking on numerical values between 0 and 100) designed to capture the occurrence of unusual events of a particular kind. For example, the Macro index measures the real-time quantity of macroeconomic news, and the NatDist index measures the real-time quantity of natural-disaster news. Each index is constructed by applying disciplined pattern-recognition algorithms to real-time newsfeeds, and calibrated using econometric methods applied to historical data. In this first release, we construct indices that are calibrated to foreign exchange markets, and future releases will focus on other markets. In this paper, we describe the procedures for constructing and validating the Reuters/AlphaSimplex Event Indices. Section 2 introduces the historical data sets used to calibrate the indices. Section 3 contains the algorithms used to construct the indices. In Section 4, we describe the event-study methodology for validating the indices, and in Section 5 we explore the connection between realized volatility (our metric for market impact) and implied volatility. We conclude in Section 7.
Keywords: TextMining
|
|
|
|
2008. The Emerging Role of News Feeds for Algorithmic Trading & Advanced Decision Support. Thomson Reuters.
Abstract: Is news “new?” In a sense, yes. Being first to take advantage of a piece of news has always been essential to traders and investors. Today, however, the overwhelming volume and ever-shrinking time available to react to it profitably has posed mounting pressure on trading desks, portfolio managers and analysts to cope with these challenges. Fortunately, technology is making it feasible to handle more news in less time in ways that were barely conceivable a few years ago. Newly available technologies and techniques are giving traders and investors innovative ways to use automation (such as complex event processing engines) to allow them to incorporate news into their strategies. These enable them to exploit market opportunities and inefficiencies driven by news events and it helps them reduce their exposure to event risks.
Keywords: TextMining
|
|
|
|
Aggarwal, C.C., Li, Y., Wang, J. & Wang, J., 2009. Frequent Pattern Mining with Uncertain Data. Paris, France: ACM Press.
Abstract: This paper studies the problem of frequent pattern mining with uncertain data. We will show how broad classes of algorithms can be extended to the uncertain data setting. In particular, we will study candidate generate-and-test algorithms, hyper-structure algorithms and pattern growth based algorithms. One of our insightful observations is that the experimental behavior of different classes of algorithms is very different in the uncertain case as compared to the deterministic case. In particular, the hyper-structure and the candidate generate-and-test algorithms perform much better than tree-based algorithms. This counter-intuitive behavior is an important observation from the perspective of algorithm design of the uncertain variation of the problem. We will test the approach on a number of real and synthetic data sets, and show the effectiveness of two of our approaches over competitive techniques.
Keywords: TextMining
|
|
|
|
Ananiadou, S., 2008. National Centre for Text Mining: An introduction to tools for researchers.
Abstract: With an overwhelming amount of knowledge recorded in texts, it has become imperative to use automated techniques that can identify, extract, manage, integrate and exploit this knowledge for research and education, efficiently and systematically. Text mining exploits these techniques. The National Centre for Text Mining (NaCTeM) offers text mining services to UK researchers that enable semantic searching of text – that is, searches based on the meanings of words, phrases or terms in different contexts – thus improving access to information and increasing the efficiency of new research methodologies and techniques based on advanced information and communication technologies (e-science and e-research).
Keywords: TextMining
|
|
|
|
Ananiadou, S., 2008. Text Mining.
Abstract: An alert reader will make connections between seemingly unrelated facts to generate new ideas or hypotheses. However, the burgeoning of published text means that even the most avid reader cannot hope to keep up with all the reading in a field, let alone adjacent fields. Nuggets of insight or new knowledge are at risk of languishing undiscovered in the literature. Text mining offers a solution to this problem by replacing or supplementing the human reader with automatic systems undeterred by the text explosion. It involves analysing a large collection of documents to discover previously unknown information. The information might be relationships or patterns that are buried in the document collection and which would otherwise be extremely difficult, if not impossible, to discover. Text mining can be used to analyse natural language documents about any subject, although much of the interest at present is coming from the biological sciences.
Keywords: TextMining
|
|
|
|
Antweiler, W. & Frank, M.Z., 2001. Is All That Talk Just Noise? The Information Content of Internet Stock Message Boards.
Keywords: FinancialRatios; TextMining
|
|
|
|
Crammer, K., Dredze, M. & Kulesza, A. Multi-Class Confidence Weighted Algorithms.
Abstract: The recently introduced online confidence-weighted (CW) learning algorithm for binary classification performs well on many binary NLP tasks. However, for multi-class problems CW learning updates and inference cannot be computed analytically or solved as convex optimization problems as they are in the binary case. We derive learning algorithms for the multi-class CW setting and provide extensive evaluation using nine NLP datasets, including three derived from the recently released New York Times corpus. Our best algorithm outperforms state-of-the-art online and batch methods on eight of the nine tasks. We also show that the confidence information maintained during learning yields useful probabilistic information at test time.
Keywords: TextMining
|
|
|
|
Das, S.R. & Chen, M.Y., 2006. Yahoo! for Amazon: Sentiment Extraction from Small Talk on the Web.
Abstract: Extracting sentiment from text is a hard semantic problem. We develop a methodology for extracting small investor sentiment from stock message boards. The algorithm comprises different classifier algorithms coupled together by a voting scheme. Accuracy levels are similar to widely used Bayes classifiers, but false positives are lower and sentiment accuracy higher. Time series and cross-sectional aggregation of message information improves the quality of the resultant sentiment index, particularly in the presence of slang and ambiguity. Empirical applications evidence a relationship with stock values – aggregate tech sector sentiment is found to predict stock index levels, but not at the individual stock level. The algorithms may be used to assess the impact on investor opinion of management announcements, press releases, third-party news, and regulatory changes.
Keywords: FinancialRatios; TextMining
|
|