Scientists Unveil Groundbreaking Human Genome Sequence

A large group of scientists, led in part by geneticists at the University of California, Santa Cruz, has published the first complete human genome. The breakthrough could lead to many new discoveries about health.

(TNS) — An international team of scientists, led by geneticists at UC Santa Cruz and the National Institutes of Health, has published the first truly complete human genome, a dramatic advance in understanding the role of genetics in disease and evolution that closes the last gaps in cataloging the 3 billion paired molecules that make up our DNA.

The full genetic sequence — first reported last summer, but officially unveiled in the journal Science on Thursday — finally fills in the roughly 8% of the human genome that scientists had been unable to work out since publishing their first draft more than 20 years ago.

The effort involved an unprecedented, grassroots collaboration of scientists around the world — many working remotely during the pandemic — who applied novel sequencing technology to sort through massive, previously unmapped strands of DNA located primarily in the dense centers of human chromosomes.

The work thrusts into sharp detail areas of the genome that previously were cast in shadow and largely unexplored. One scientist involved in the project compared it with seeing the first high-resolution photographs of distant planets after decades of only having access to blurred, black-and-white images — or no visibility at all.

"For decades we've had large, persistent gaps in our reference genome and in our understanding of how the genome works," said Karen Miga, a UC Santa Cruz geneticist who helped guide the new project. "These gaps span functionally important, dare I say critical parts of our genome. Having these maps hopefully will drive new biology in new directions."

The work was coordinated by the Telomere to Telomere Consortium that Miga co-founded with Adam Phillippy, head of the genome informatics section at the National Human Genome Research Institute in Bethesda, Md., part of the NIH. Telomeres are sections of DNA that cap the ends of each chromosome, so the consortium name came from the goal of mapping the full stretch of DNA, from one end to the other.

The complete sequence means scientists can now confirm that the human genome is made up of 3.055 billion base pairs — the molecular building blocks referred to by the letters A, C, G and T that form the genetic recipe for building humans from a single cell — which are sorted into 19,969 functionally useful genes.

The Telomere to Telomere group added 200 million base pairs and 115 active genes to the most recent draft of the genome, which was finished almost a decade ago and updated in 2019. Scientists on the project compared the size of the gap they filled to finding an entire continent on a world map.

"We've added more than 200 million new bits of information to the genome — hundreds of millions of bits about what makes us human," said Nicolas Altemose, a postdoctoral fellow at UC Berkeley who was part of the consortium. "It's been a massive amount of new information to process and annotate. We're only just beginning to figure it all out, but getting this first clear glimpse of what's in those regions has been incredible."

The sequence provides the first full account of key parts of human chromosomes, including the centromeres — dense collections of DNA strands that make up the middle of the chromosome — and the telomeres at either end. Chromosomes are wormlike strands that contain the DNA instructions; humans have 23 pairs of them.

The complete sequence should help scientists better understand the underlying genetics of all kinds of human disease, including cancer. It also provides insight into human evolution, as the regions of the chromosome that were unmapped before contain some of the greatest genetic diversity.

"There are all these genes and arrangements of genes that we've never seen before. There will undoubtedly be discoveries about what they do and their role in disease," said Benedict Paten, a biomolecular engineer at UC Santa Cruz who was part of the consortium.

Dr. Euan Ashley, a professor of medicine at Stanford who has used the incomplete genome to help diagnose rare genetic disorders and for other clinical applications, called the full sequencing accomplishment a "tour de force" that could have enormous implications for disease diagnosis, treatment and prevention.

"Applying the genome to medicine has been transformative, and that was achievable with a much less complete [genome] reference," Ashley said. "Any time there's an improvement like this, that's something to be celebrated."

More than 100 scientists joined the consortium to complete the genome, many of them as a sort of passion project, working outside their usual laboratory hours. The lead scientists said it was hard to calculate the total cost because of how the resources were spread out, but all told it was a few million dollars. The first draft of the genome cost about $300 million to generate.

The complete sequence was generated from a rare type of tumor called a hydatidiform mole, a nonviable human embryo that is made up of only paternal DNA from the sperm. That made it simpler to sequence, because scientists didn't have to separate two sources of DNA.

Earlier drafts of the genome were incomplete largely because the gaps were made up of extremely long strands of repetitive DNA that were impossible to place. Scientists compare it with trying to finish a very large jigsaw puzzle with tiny pieces that are all the same color and shape.

"If you think of a puzzle of a beautiful landscape, you start with the edges and the unique colors, and you leave the blue sky for last because the pieces all look the same. That's what repetitive sequencing is like," Altemose said.

The first drafts were done by sequencing small strands of DNA, a few hundred letters at a time, and looking for sections that overlapped. Those sections would then be stitched together, like pieces of a quilt, until a near-complete genome was assembled.
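
For readers who want to see the stitching idea in concrete terms, here is a minimal sketch in Python. It is an illustration only, not the software used for any draft of the genome: the function names, the toy reads and the three-letter minimum overlap are all assumptions made for the example, and real assemblers must also cope with sequencing errors, both DNA strands and billions of reads.

```python
def overlap_len(left: str, right: str, min_overlap: int) -> int:
    """Length of the longest suffix of `left` that matches a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_overlap: int = 3) -> list[str]:
    """Repeatedly merge the two fragments that share the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, -1, -1)  # (overlap size, index of left piece, index of right piece)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap_len(a, b, min_overlap)
                    if size > best[0]:
                        best = (size, i, j)
        size, i, j = best
        if size == 0:
            return contigs  # nothing left that overlaps enough to merge
        merged = contigs[i] + contigs[j][size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)]
        contigs.append(merged)


# Hypothetical toy reads; real short reads are a few hundred letters long.
toy_reads = ["ATGGCGTG", "GCGTGCAAT", "GCAATCCGTA"]
assembled = greedy_assemble(toy_reads)
print(assembled)  # ['ATGGCGTGCAATCCGTA']
```

The three toy reads overlap by five letters each, so they merge into a single sequence. When many reads look alike, as they do in repetitive DNA, the overlaps stop being unique and this kind of stitching can no longer tell where a piece belongs.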

New technology capable of sequencing much larger strands of DNA, up to tens of millions of letters, made it possible to finally place the repetitive sections in the genome. To return to the puzzle analogy, it is as if scientists were able to work with much larger pieces to finally finish the sky.
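
A small Python sketch can show why longer pieces help. Everything in it is hypothetical: the two toy sequences, the flanking letters and the read lengths are assumptions chosen for illustration, and the reads are idealized as error-free and covering every position. One sequence repeats the unit "CAG" four times, the other seven times.

```python
def all_reads(genome: str, read_len: int) -> set[str]:
    """Every distinct error-free read of a given length (an idealized assumption)."""
    return {genome[i:i + read_len] for i in range(len(genome) - read_len + 1)}


# Hypothetical unique flanks around a repeated three-letter unit.
LEFT, RIGHT, UNIT = "TTGACCTA", "GGATCAAT", "CAG"
four_copies = LEFT + UNIT * 4 + RIGHT
seven_copies = LEFT + UNIT * 7 + RIGHT

# Short reads never span the whole repeat, so both versions yield the same pieces.
short_same = all_reads(four_copies, 6) == all_reads(seven_copies, 6)

# Long reads reach from one unique flank to the other, so the versions differ.
long_same = all_reads(four_copies, 25) == all_reads(seven_copies, 25)

print(short_same, long_same)  # True False
```

With six-letter reads the two sequences produce exactly the same set of pieces, so the number of repeats cannot be recovered from the pieces alone. With 25-letter reads the sets differ, because each read carries the repeat together with the unique letters on both sides. The sketch ignores how often each read appears, which real methods can use as a partial clue.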

A next step for the consortium is to finish analyzing many more genomes, including those of actual humans rather than a failed embryo. One group of scientists is working on a pangenomic project that seeks to sequence 350 genomes collected from around the world, which would capture a robust view of human genetic diversity.

Scientists who were part of the original human genome effort celebrated the unveiling of the full genome — and the completion of the work they began more than two decades ago.

"The human genome reflects the result of 4 billion years of struggles by our ancient ancestors to pass on a message written in DNA from parent to offspring — a whole recipe for a human being. To complete the genome is a scientific milestone of enormous importance," said David Haussler, director of the UC Santa Cruz Genomics Institute, who was involved in the first human genome project but not part of the telomere consortium.

"Understanding what's in the genome is a completely different beast," Haussler added. "It's very hard to read. But the book is now published."

©2022 San Francisco Chronicle, Distributed by Tribune Content Agency, LLC.