Big data is more than just a buzzword in an exploding technology market — it impacts many facets of our everyday lives. In fact, it is delivering substantial value in health care, and may be applied to help solve today’s most recent world health crisis: the Zika virus.
For patients with Zika, the symptoms are mild and similar to those of a few other mosquito-borne viral infections. This virus, however, can be passed prenatally from mother to child and has been linked to microcephaly, an incurable birth defect. Zika also appears to be spreading rapidly. Though today it is endemic to a minority of continents and countries worldwide, the virus's reach grows every day. It is speculated that more than 3,000 children have neurodevelopmental disorders linked to Zika.
Health-care and life sciences practitioners are racing to understand the virus, as infection spreads via world travelers who are unaware that they are carriers. In the effort to unravel Zika, there are many goals: Understand the disease, create a vaccine, create tests to identify patients who have Zika, stop the spread of Zika and find ways to cure microcephaly.
So how can big data support the fight against Zika?
The real question might be the converse: Where isn’t big data helping, or where can’t it be applied? At both the patient and population levels, big data helps us understand who suffers from epidemics, who will experience an epidemic, what causes epidemics, how they spread and how to treat them. Big data also can help to prevent future spread by, for example, determining how to get vaccines to the right places before they're needed.
Digging a bit deeper, one of big data's most common applications in health care is in signal detection and surveillance at patient intake. Health systems today often use big data technologies like Apache Hadoop to ingest real-time streams of data from bedside monitors and machines, wearables, and other sources, and combine that with electronic health record (EHR) data using the Health Level Seven (HL7) standard. Legacy technologies couldn’t accommodate the variety of data types or keep pace with the speed at which it flows from these sources at scale.
Big data technologies, however, remove these constraints, making it possible to analyze and apply intelligence to multiple EHR feeds with clinical text as they stream in. In epidemiology, this takes the form of identifying patients, as they report symptoms of Zika for the first time, who might otherwise escape detection at the point of care due to caregiver error or other factors. In the case of Zika, a single error or delay in diagnosis can have serious consequences.
Health systems are happy to invest in applying real-time computer intelligence as a backup to their trained professionals — especially when they are seeing tens of thousands of new patients each day.
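To make the idea of an automated backup at intake concrete, here is a minimal sketch in Python of screening free-text intake notes for a cluster of symptoms. Everything in it is an assumption made for illustration: the `IntakeNote` record, the `flag_for_review` function, the short symptom list and the two-symptom threshold are hypothetical, not a clinical case definition and not how any particular health system or Hadoop deployment does it. A real pipeline would read HL7 messages from a stream and would need to handle negation ("no fever"), which this sketch does not.

```python
import re
from dataclasses import dataclass

# Assumption: an illustrative symptom list, not a clinical case definition.
ZIKA_SYMPTOM_PATTERNS = {
    "fever": re.compile(r"\bfever\b", re.IGNORECASE),
    "rash": re.compile(r"\brash\b", re.IGNORECASE),
    "joint pain": re.compile(r"\b(joint pain|arthralgia)\b", re.IGNORECASE),
    "conjunctivitis": re.compile(r"\b(conjunctivitis|red eyes)\b", re.IGNORECASE),
}


@dataclass
class IntakeNote:
    """Hypothetical record: one free-text note captured at patient intake."""
    patient_id: str
    text: str


def matched_symptoms(note):
    """Return the sorted names of the symptoms mentioned in the note."""
    return sorted(
        name
        for name, pattern in ZIKA_SYMPTOM_PATTERNS.items()
        if pattern.search(note.text)
    )


def flag_for_review(notes, min_symptoms=2):
    """Yield (patient_id, symptoms) for notes that mention enough symptoms.

    The threshold is an arbitrary illustrative value. A flag here means
    "ask a clinician to take a second look", never a diagnosis.
    """
    for note in notes:
        hits = matched_symptoms(note)
        if len(hits) >= min_symptoms:
            yield note.patient_id, hits


notes = [
    IntakeNote("p1", "Reports fever and rash after recent travel."),
    IntakeNote("p2", "Sprained ankle, no fever."),
]
flagged = list(flag_for_review(notes))
```

In this toy run only the first note is flagged. The second mentions a single symptom word and falls below the threshold, which also shows why simple keyword matching is only a starting point: the phrase "no fever" still counts as a mention.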
And perhaps the most important application of big data is in the genomic dimension. The reason why some babies resist microcephaly and others don’t will be answered by the genome of the mother or the child, or both. This is true even if environment or drugs play a significant role. In the past, the absence of big data technologies and limitations in genetic sequencing meant researchers often had to pick a small number of genes to analyze when a health problem arose. This might be akin to having to pick a few "strategically guessed" clumps of hay in a stack while searching for the needle.
Today, with big data technology in place, researchers can look through the entire exome or genome, an approach known as whole exome or whole genome sequencing. This ability is game-changing: When it comes to solving a global health-care crisis like Zika, it is no longer plausible that genomic analytics will not play a major role.
Big data is not a cure — it can neither be injected by a caregiver nor provide the formula for a vaccine. But big data does have a vital role to play in piecing together a solution for any global health crisis.
Big data can be used to listen to traditional media, social media and adverse event channels for early signals that something is wrong, or to calculate the risk that the patient sitting next to you in the emergency room has Zika. It can collect the whole exomes of every newborn or track outcomes on an ongoing basis across the whole population treated with a vaccine. Clearly, leveraging big data technologies is and will continue to be crucial during times of crisis.
Hadoop-based technologies can integrate detailed, complex and multi-structured data as it is generated from a virtually unlimited number of sources. They can identify patterns in that data and facilitate data discovery and analytics, ultimately helping to expedite the detection of the disease and outcome drivers, and enabling clinicians to deliver the best care based on real-time, data-driven decision-making. They can analyze whole genome data and merge multi-omic data with clinical/phenotype data quickly and efficiently, shortening R&D cycles and allowing us to find treatment and prevention methods faster. And they can measure and evaluate the impact of treatments at scale, suggesting improvements along the way to optimize outcomes and eventually, hopefully, eliminate viruses like Zika altogether, faster than we ever could before.
Shawn Dolley is Cloudera's Health & Life Science Big Data Expert. Shawn co-led the Healthcare & Life Science industry practice at Netezza and, after Netezza was acquired by IBM, was part of IBM's Public Sector Big Insights industry practice. Shawn designed a Health Outcomes Analytic Appliance in conjunction with Brigham & Women's Hospital and Harvard Medical School.