Government Technology

Twitter Addresses Data Analysis Skill Shortage with UC Berkeley Class


December 18, 2012 By

An elective class arms students with big data analysis skills that are in high demand.  

In general, people with big data analysis skills and real experience are hard to find, said Gilad Mishne, engineering manager of search at Twitter. And competition among companies for these skilled people is fierce, especially in Silicon Valley.

Because users send 400 million tweets a day, Twitter needs complex algorithms to analyze all that data. And it needs skilled engineers to do the job. So it partnered with UC Berkeley on a class that teaches students how to analyze data using real tweets.

The "Analyzing Big Data with Twitter" class included lectures from UC Berkeley professors and 15 volunteers from Twitter on the technology behind the social network. These lectures were videotaped and are available online for anyone to watch.

Along with the lecturers, 12 volunteer mentors from Twitter worked with 40 undergraduate and graduate students as they built data-driven applications. For example, one team used tweets to find funky restaurants around campus. This application was something they could use the next day, and both the engineer supervising the team and the students had fun working on it.

"This is priceless," Mishne said. "The first thing I actually look at when I see a CV (curriculum vitae) is, 'Did this student do something real? Did he build something or did she build something with real data?' And this is exactly what I would look for — this kind of experience."

In one project, students analyzed Twitter interest graphs (who links to whom) and conversation graphs (who refers to whom), said Marti A. Hearst, professor in the UC Berkeley School of Information. Students made interesting visualizations for this assignment through simple graphing algorithms that showed hundreds of thousands of interests that people discussed online.
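
As a rough illustration of the two graph types described above (not the students' actual code), the sketch below builds a small interest graph and a conversation graph from made-up edge lists and counts how many accounts point at each user. The account names, the `build_graph` and `in_degree` helpers, and the data are all hypothetical.

```python
from collections import Counter, defaultdict


def build_graph(edges):
    """Return an adjacency map {source: set(targets)} for a directed edge list."""
    graph = defaultdict(set)
    for source, target in edges:
        graph[source].add(target)
    return graph


def in_degree(graph):
    """Count how many distinct accounts point at each account."""
    counts = Counter()
    for targets in graph.values():
        counts.update(targets)
    return counts


# Hypothetical toy data: (follower, followed) and (author, mentioned) pairs.
follows = [("ana", "chef_bo"), ("raj", "chef_bo"), ("ana", "raj"), ("lee", "chef_bo")]
mentions = [("ana", "raj"), ("raj", "ana"), ("lee", "raj")]

interest_graph = build_graph(follows)        # who links to whom
conversation_graph = build_graph(mentions)   # who refers to whom

# The account with the most incoming edges in each graph.
most_followed = in_degree(interest_graph).most_common(1)
most_mentioned = in_degree(conversation_graph).most_common(1)
```

Counting incoming edges is only one of the simplest measures of influence; a real analysis at Twitter's scale would need distributed tooling rather than in-memory dictionaries.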

"To me, what was interesting was how much you can see about the different topics that do arise from who links to whom, especially if you're looking at more well-known people, not necessarily celebrities, but people who have a lot of influence in the Twittersphere," Hearst said.

These students could use their newly acquired skills in fields including public health, business and city planning. As more types of data and real-time data are collected, analysts can see accurate trends in the spread of disease. They can understand what customers think about a product and trigger a response. And they can see where new fire stations or social services should be located based on where people are living and moving to.

But more students need to be trained to do this kind of work. And to help train these students, Mishne said he would like to repeat a class like this at UC Berkeley and at other institutions that express interest.

This story was originally published at the Center for Digital Education website.





You may use or reference this story with attribution and a link to
http://www.govtech.com/education/Twitter-Addresses-Data-Analysis-Skill-Shortage-with-UC-Berkeley-Class.html



Comments

Don Turnblade    |    Commented December 19, 2012

Strong large data set analytical skills are not as difficult to find as some might guess. Rather, IT trending has made a buzz phrase that actually complicates the matter. For example, a well-trained Six Sigma Black Belt with Design of Experiments and Hypothesis Testing skill sets can do Big Data Set Analytics. Any Aerospace Engineer with Rocket Telemetry Analysis skills can do it. Any medical or insurance staff with actuarial training can do it. These groups simply do not call statistical, hypothesis and trend analysis skills "Big Data" analytical skills. The problem is an artifact of jargon.

Sally Maki    |    Commented December 31, 2012

I have seen the same aversion to the "big data" buzz from other big names in analytics. I agree that there are a lot of other strong skill sets that provide a good foundation, but I also think the new challenges shouldn't be underestimated. My understanding is that the term Big Data isn't just about statistics on large, static (or frozen), or designed data sets. I think it is intended to bring attention to the specific challenges associated with analyzing what's currently happening with data sets that are growing and changing at an unprecedented rate. Most query training I've seen doesn't explain why one query is faster than another, for example - but that difference changes what's possible with big data. There are also challenges associated with working with emergent, opportunistic and unstructured data sets (especially comment text), rather than data coming from designed experiments. Even rocket scientists haven't had to analyze 400 million new data points a day that say things like "NYE we streaming live on http://www.xumanii.com at 10:30 CST!!!! https://socialcam.com/videos/ea7ke1Vu?type=email … #CHAPTERVTOUR".



