CS Blog

Data is the new soil of business and (soon) at the core of essentially all domains, from materials science to healthcare. The possibilities are nearly endless. In 2007, Wired magazine even went so far as to predict "The End of Science: The quest of knowledge used to begin with grand theories. Now it begins with massive amounts of data." As a showcase: researchers in the Archaeology and Anthropology department at the University of Bristol used n-grams created by the Google Books project to analyze how emotions have been expressed over the centuries. The results clearly show the effect of World War II as well as the "happy" times of the 60s, whereas, surprisingly, World War I had no measurable effect (see the figure below).

Emotions over time
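
To give a flavor of what such an analysis involves, here is a minimal Python sketch. It is not the Bristol researchers' actual method, which this post does not describe. It assumes 1-gram rows of the form (word, year, count), and both the tiny emotion lexicon and the sample rows are made up for illustration. For each year it computes the share of all word occurrences that belong to each emotion category.

```python
from collections import defaultdict

# Hypothetical emotion lexicon; the word lists used in the study are not given here.
EMOTION_WORDS = {
    "joy": {"happy", "glad", "delight"},
    "sadness": {"sad", "grief", "sorrow"},
}


def emotion_share_by_year(records, lexicon):
    """Return {year: {emotion: share}} from (word, year, count) 1-gram rows.

    The share is the number of occurrences of an emotion's words divided by
    the number of occurrences of all words in that year.
    """
    totals = defaultdict(int)                     # all word occurrences per year
    hits = defaultdict(lambda: defaultdict(int))  # emotion word occurrences per year
    for word, year, count in records:
        totals[year] += count
        for emotion, words in lexicon.items():
            if word.lower() in words:
                hits[year][emotion] += count
    return {
        year: {emotion: hits[year][emotion] / total for emotion in lexicon}
        for year, total in totals.items()
        if total > 0
    }


# Made-up toy rows, only to show the input shape.
SAMPLE = [
    ("happy", 1940, 2), ("sad", 1940, 6), ("the", 1940, 92),
    ("happy", 1965, 8), ("sad", 1965, 2), ("the", 1965, 90),
]

shares = emotion_share_by_year(SAMPLE, EMOTION_WORDS)
for year in sorted(shares):
    print(year, shares[year])
```

Normalizing by the yearly total matters because the number of digitized books differs a lot from year to year, so raw counts alone would mostly reflect corpus size.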

Even though it is just one example, it demonstrates how data is transforming research. But it doesn't stop there. Companies are affected by big data in a similar way. For instance, Google Translate or Netflix's streaming offering wouldn't be possible without large amounts of data and the techniques to transform that data into actionable knowledge.

In Spring 2014 I (Tim Kraska) will offer a new course called Introduction to Data Science. Data science refers to the techniques and the processing pipeline used to extract knowledge from data, which is what unlocks the potential of big data. It spans a wide variety of tools and techniques, including data storage, large-scale processing infrastructure, databases, data cleaning, statistics, machine learning, and visualization. Obviously, we cannot cover it all. In fact, each of these disciplines could be, and actually is, offered as an individual course. Instead, this course is about the data processing pipeline in its entirety and how the different components work together. It will provide an overview of the techniques involved in data science, with a stronger focus on the tools than on the statistical algorithms used to extract information from data. In their assignments, students will learn hands-on how to apply these techniques to extract knowledge from data.

If you would like to learn more, visit our web page: http://cs.brown.edu/courses/csci1951-a/