Lecture: Visualizing and Measuring Social Data, Jan. 29 at 3:00pm

Visualizing and Measuring Social Data
Matt Taddy
Associate Professor, Econometrics and Statistics, University of Chicago Booth School of Business

Wednesday, January 29, 2014, 3:00pm-4:30pm
Kathleen A. Zar Room, John Crerar Library

Biography:
Matt Taddy is an associate professor of econometrics and statistics and Neubauer Family Faculty Fellow at the University of Chicago Booth School of Business. His research focuses on statistical methodology and data mining, driven by applications in business and engineering. He developed and teaches the MBA ‘Data Mining’ course at Chicago Booth.

Taddy works on building robust solutions for large-scale data analysis problems. This involves dimension reduction techniques for massive datasets and the development of models for inference on the output of these algorithms. Applications are ongoing in consumer database mining, digital marketing, analysis and optimization of computer simulators, and in text mining for analysis of social media, financial news, and political speech. He has collaborated both with small start-ups and with large research agencies including NASA Ames, Lawrence Livermore, Sandia, and Los Alamos National Laboratories.

Taddy earned his PhD in Applied Mathematics and Statistics in 2008 from the University of California, Santa Cruz, as well as a BA in Philosophy and Mathematics and in Mathematical Statistics from McGill University. He joined the University of Chicago Booth faculty in 2008.

Abstract:
This presentation will explore ‘Big Data’ problems: inference from unstructured data that is too large to analyze, or even store, on a single computer. In social science, this will typically involve some amount of text analysis, even if just to extract variables of interest from the original data. An example application could involve the internet browsing history for a number of individuals, their physical location, a corpus of text associated with these individuals (both from the web pages they visit, and text generated on social media), and, say, purchases by these individuals. Or, we may wish to understand the relationship between a number of economic or political variables and the co-movement of topics, terms, and tone in the news and on social media. 

In such text mining applications there is an interplay between two visualization strategies: plotting predictions and factors that summarize the information you have mined from the text, and looking at the role played by individual phrases or words in the model driving this summarization. Getting these two modes of visualization to work together is key to communicating and understanding results. In this talk, Dr. Taddy will cover the basic idea of how the statistical models behind the analysis work, and use this to understand what one might want to be plotting. The main goal is then to illustrate how the two modes of visualization work together. This will be shown through example applications including financial news, political speech, and social media. 

*Cookies and Refreshments will be served*