CS Blog

Professor Jeff Huang And 19 Students Crowdsource A Dataset Of 2200 Computer Science Faculty

Professor Jeff Huang and 19 students enrolled in his CS2951-L seminar on Human-Computer Interaction released the first free dataset with information on the academic development of all professors in 50 top US Computer Science graduate programs.

As part of an assignment, the students were given a fixed budget in Amazon Mechanical Turk credit and were asked to come up with crowdsourcing strategies that would help them gather a full record of all faculty in subsets of Computer Science departments.

After 16 days, PhD student Alexandra Papoutsaki led a team of users to merge and clean the data gathered by the crowdworkers, yielding a valuable academic resource on the Computer Science landscape.

"This data could be useful," Huang said, "for identifying hiring trends in computer science research areas, computing relative rankings for Bachelor's, Master's, or PhD programs in terms of placing students into faculty jobs, or for students applying for PhD programs to identify universities with faculty in their area of interest. We’re releasing it as a publicly editable resource so others can benefit, and we’ve already heard from hundreds of people who have used the data to make departmental decisions, to inform a conversation with their provost, and to evaluate how their department compares to others. So far, more than 21,000 people have viewed our generated information in just a week. There have already been 450 contributions and corrections, and this number keeps growing every day. We are really excited to see the academic community embracing our dataset in such a way."

You can read more about the experiment here, and you're encouraged to download the latest version of the dataset and to add missing entries or correct any that are wrong.