UW Genome Sciences Hackathon Leads to New Long-Read Sequencing Technology

‘An opportunity to test a wide range of machine learning approaches’


Anupama Jha (left) and Mitchell Vollger: 'Long-read sequencing has the potential to dramatically advance our understanding of the genetic basis for human diseases, but one of the major challenges is sorting out which of the millions of genetic variants each individual harbors is actually impacting their health.'

“Solving complex problems needs a community,” said the UW’s Andrew Stergachis, M.D., Ph.D., a member of the Brotman Baty Institute. “One lab could not have accomplished what we did. This highlights the wonderful, collaborative work that was enabled by the first UW Genome Sciences hackathon.”

That “wonderful, collaborative” work in September 2022 led to “fibertools,” a “state-of-the-art” toolkit enabling faster and more accurate understanding of how genetic variation impacts human health and disease using long-read sequencing.

  • How much faster? One thousand times faster.
  • How successful? The technology, thus far, has been downloaded nearly 40,000 times.
  • Impact on health? The tool has already enabled the discovery of the mechanistic basis of two rare human diseases as part of the National Institutes of Health (NIH) Undiagnosed Diseases Network and the GREGoR Consortium (Genomics Research to Elucidate the Genetics of Rare diseases).

An article on the power of “fibertools” was recently published in the journal Genome Research.

Founded in 2022, the Genome Sciences hackathon is a 5-day event that brings together trainees and faculty to address some of the more challenging genomic problems that require teamwork to solve.

“Participants praised the event for the opportunity to collaborate across labs, learn new skills, and to foster community building in a constructive goal-oriented fashion,” said Sayeh Gorjifard, a post-doc in the Queitsch lab, who organized and produced the first hackathons for Genome Sciences. “They also praised the event for its ability to enable trainees to test different projects and collaborate with people outside of their labs. It is a testament to how powerful student-led grassroots initiatives can be in moving science and the scientific community forward.”

Mitchell Vollger, Ph.D., a post-doc in the Stergachis lab, played a key role in the hackathon team that created “fibertools.”

“Long-read sequencing has the potential to dramatically advance our understanding of the genetic basis for human diseases, but one of the major challenges is sorting out which of the millions of genetic variants each individual harbors is actually impacting their health,” said Vollger, the paper’s corresponding author. “Our lab has pioneered an approach that helps solve this problem by essentially ‘spray painting’ the genome to see what is physically bound to it and then identifying ‘stencils’ that are uniquely present in an individual and, hence, indicate a region that may harbor a genetic variant that could be impacting their health. However, the computational tools for identifying these ‘stencils’ were slow, expensive to use, and did not work with the latest sequencing technologies.”

As part of the hackathon, a team of 11 researchers across seven labs at the UW tested whether they could use advanced machine learning techniques developed in the Noble lab to improve the accuracy and speed of this approach. The team was led by Vollger and post-docs Anupama Jha and Stephanie Bohaczuk, and also included Connor Finkbeiner, Morgan Hamm, Tony Li, Alan Min, Elliott Swanson, Dale Whittington, Stergachis, and Bill Noble.

“The hackathon provided an opportunity for us to test a wide range of machine learning approaches, enabling us to establish that this was a solvable problem, as well as the path for solving it,” said Jha, Ph.D., a post-doc in the Noble lab and first author on the Genome Research paper. “In the months after the hackathon, we finished creating ‘fibertools,’ which we have proven is a generalizable and fast approach for identifying genetic variants that impact human health using long-read sequencing. And, of course, we proved that it is very accurate.”

The science of “fibertools” will continue moving forward with the online publication of the paper in Genome Research, as well as a paper describing its use to solve the genetic basis of human disease recently published in the journal Nature Genetics.
