Wednesday, June 29 • 11:00am - 11:05am
Text Mining and Sentiment Extraction in Central Bank Documents

The deep transformation induced by the World Wide Web (WWW) revolution has thoroughly impacted a relevant part of the social interactions in our present global society. The huge amount of unstructured information available on blogs, forums and public institution web sites puts forward different challenges and opportunities. Starting from these considerations, in this paper we pursue a two-fold goal. First, we review some of the main methodologies employed in text mining and for the extraction of sentiment and emotions from textual sources. Second, we provide an empirical application by considering the latest 20 issues of the Bank of Italy Governor’s concluding remarks, from 1996 to 2015. By taking advantage of the open source software package R, we carry out the following steps:
1. checking the word frequency distribution features of the documents;
2. extracting the evolution of the sentiment and the polarity orientation in the texts;
3. evaluating the evolution of an index for the readability and the formality level of the texts;
4. attempting to measure the popularity gained by the documents on the web.
The results of the empirical analysis show the feasibility of extracting the main topics from the considered corpus. Moreover, it is shown how to check for positive and negative terms in order to gauge the polarity of statements and whole documents. The R packages employed have proved suitable and comprehensive for the required tasks. Improvements in the documentation and the package arrangement are suggested for increasing the usability.
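As an illustration of steps 1 and 2, the following minimal sketch counts word frequencies and computes a simple lexicon-based polarity score. It is not the author's code: the paper works in R, while this sketch uses Python for brevity. The word lists, the function names and the sample sentence are hypothetical, and a real analysis would rely on a published sentiment dictionary rather than a toy lexicon.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (assumption, not from the paper).
POSITIVE = {"growth", "recovery", "improvement", "stability"}
NEGATIVE = {"crisis", "recession", "risk", "decline"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def word_frequencies(text):
    """Return a Counter mapping each token to its number of occurrences."""
    return Counter(tokenize(text))


def polarity(text):
    """Return (pos - neg) / (pos + neg), in [-1, 1]; 0.0 if no lexicon term occurs."""
    tokens = tokenize(text)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


# Hypothetical sample sentence, not taken from the Governor's remarks.
sample = "The recovery continues, but the risk of a new crisis remains."
frequencies = word_frequencies(sample)
score = polarity(sample)
```

Applying `polarity` to each yearly document in turn would give one score per year, which is one simple way to trace the evolution of polarity over the corpus.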

Giuseppe Bruno

Bank of Italy

