Big data. The term is spreading through the industry like wildfire, with vendors appearing in droves, each claiming its latest solution will help agencies increase efficiency. The concept seems simple enough: technology pulls data sets from a variety of systems and returns usage trends and other patterns that government leaders can use to make better decisions.
But with so many products in the marketplace, getting a handle on all the options available can be a headache for even the savviest CIO. To help cut through the confusion, Government Technology took a look at 10 big data solution providers. Whether your data resides on an open source framework such as Apache Hadoop or a proprietary database system, the following list — presented in alphabetical order — offers a snapshot of the types of big data storage and analytical technology available today.
What it does: CommVault’s Simpana Platform lets users analyze, back up, recover, replicate, archive and search data across their enterprise and across any storage device, according to the company. The platform includes Simpana OnePass, which integrates archiving, backup and reporting into a single process to eliminate operational complexity and reduce cost. The products are designed to work on large-scale petabyte-level file systems and Microsoft Exchange messaging environments.
How it’s different: Emily Wojcik, CommVault’s senior manager of product marketing, said the technology reduces scan times because backup, archiving and reporting are performed as a single operation, improving efficiency. “What makes it very different from other vendors out there is that we’ve built this whole platform for data and information from the ground up,” Wojcik added. “No acquisitions.”
Reference customer: The Afognak Native Corp., a quasi-public organization formed in 1971 to conduct business on behalf of Alaska’s indigenous people, is a Simpana user. The corporation needed simpler, faster searching and retrieval of data to meet legal discovery requests. In addition, it wanted to improve its disaster recovery.
Using Simpana 9 software from CommVault, the corporation can now recover backed-up data within 75 minutes, and has an improved e-discovery system that integrates with Microsoft’s Windows Azure cloud platform.
What it does: EMC bills its Isilon platform as highly scalable storage for the big data era. Isilon is a storage and management solution for file-based, unstructured data such as audio content, video footage, large home directories, massive log files and analytical data in general. Capacity can expand from a few terabytes to 20 petabytes depending on need, the company says.
How it’s different: Audie Hittle, federal CTO of EMC Isilon, said scalability is Isilon’s calling card for public-sector customers. Hittle said Isilon’s architecture lets customers add capacity without rebuilding or replacing systems.
Reference customer: Although Hittle couldn’t name Isilon’s premier public-sector customer, he described it as an “intelligence wing of a federal agency.” He said the organization used Isilon to consolidate its data storage equipment from 19 racks to three and to reduce its need for support staff.
What it does: IBM’s Smarter Planet initiative offers an array of big data solutions for public safety, transportation, social services programs, tax and revenue, and education. These products often include advanced case management and predictive analytics modeling capabilities.
How it’s different: IBM takes a program-specific approach to big data in the public sector. For example, the company works with state unemployment insurance programs to improve claims handling and automate processing of routine claims, leaving only complex matters for case adjudicators.
The company is also focused on how big data can be applied in K-12 education.
“We’re involved in a number of initiatives that really are taking advantage of the explosion of digital content and its relationship to the classroom,” said Gregory Greben, vice president of public-sector business analytics and optimization practice at IBM Global Business Services.
Reference customer: Police in Fort Lauderdale, Fla., use IBM technology to mash together traditional criminal justice data and information from other city departments to gain new insights on criminal activity. New analysis tools will let the city police department comb through traffic and transportation information, building permits and social media activity in addition to standard criminal justice databases. Correlating these diverse data sets could help the department anticipate where crimes will occur and put cops in the right places to stop them.
What it does: Informatica says its PowerCenter Enterprise product provides a platform for data integration initiatives like data governance, data migration and enterprise data warehousing. It scales to support large volumes of disparate data sources, the company says, turning raw data into actionable information.
How it’s different: PowerCenter cuts development and deployment time by letting users integrate their own data in a shared graphical environment. The product is designed to help organizations take advantage of big data without requiring knowledge of specialized programming languages or frameworks, the company says.
“The data scientist can spend more time doing analytics and science, and they can turn over the more mundane tasks of pipelining the data in … to somebody who knows data, but doesn’t necessarily know Hadoop,” said Todd Goldman, the company’s vice president and general manager of enterprise data integration and data quality. PowerCenter also can automatically clean up “dirty” data produced by RFID sensors and other sources, he added.
Reference customers: Informatica works with a variety of public-sector agencies, most notably the state of Colorado and the IRS. The IRS uses Informatica software to convert data from multiple legacy formats into useful information. Colorado is analyzing student data and human services information to predict student success, and to direct students to appropriate support programs.