Drs. Doug Fowler, left, and Lea Starita: '(This paper) represents a pinnacle built on an enormous amount of effort across a long stretch of time, by many people – a lot of innovation, a lot of data.'
[EDITOR’S NOTE: BBI’s Drs. Lea Starita and Doug Fowler are co-corresponding authors on “A Scalable Approach to Resolving Variants of Uncertain Significance,” a preprint available on bioRxiv. The paper has more than 70 authors from research institutions and clinical facilities around the world. Here, Starita and Fowler discuss the paper’s findings and the collaboration behind them.]
Each of you has devoted nearly 15 years to studying genetic variants. What makes this paper – and its findings – unique?
Starita: This paper is the culmination of all those efforts. Researchers have been testing the effects of genetic variants in cells since the dawn of molecular biology. But it is only with the advent of next-generation sequencing – and with the ability to read and write DNA at scale – that we have been able to apply these functional assays to the problem of variants of uncertain significance.
Any person on the planet can walk into a clinic and receive a genetic test result that includes a variant we can’t interpret. We’re going to keep finding them. This is a never-ending problem. But what has changed in the past decade is the rise of computational models that can predict variant effects. Through these collaborations, we’ve figured out how to harness predictive data in a more accurate and responsible way to support variant classification. This paper demonstrates that revolution – so that more people receiving genetic testing will get actual answers, rather than simply being told, “We don’t know.”
Fowler: This paper is the culmination of work that began 15 years ago. It represents a pinnacle built on an enormous amount of effort across a long stretch of time, by many people – a lot of innovation, a lot of data. Unlike a typical research paper centered on a single dataset or experiment, this reflects decades of building technologies, scaling them toward a clinical goal, and assembling the collaborations needed to get there.
What makes this paper truly different is the scale of collaboration – in two senses. First, it’s a collaboration through time, with our past selves and all the work that has accumulated over 15 years. And second, it’s a collaboration across a remarkable range of people and institutions: clinicians, scientists, data generators, computational biologists, and colleagues at the Impact of Genomic Variation on Function Consortium, the National Human Genome Research Institute, and the software team at BBI.
The paper states that you and the other authors “developed a scalable workflow using only experimental and predictive evidence.” Can you explain that workflow and how you developed it?
Fowler: The workflow has three parts: acquiring the experimental and predictive data; a translation step called “calibration,” which converts raw data into evidence a clinician can actually use; and then applying that evidence to assess large numbers of genetic variants. Each of those steps has been done before in isolation. The real innovation of this paper is bringing them all together, and doing so at the scale of 40 genes. If you can do it for 40 genes, you can do it for 400 – or 4,000. What excites me is that this is the first real demonstration, at scale, of all these pieces working together, along with the infrastructure needed to keep them working.
Starita: The calibration step is critical. We take raw scores – from computational predictors or from lab-generated functional assays – and compare them against known clinical data. Historically, that translation has been a bespoke, labor-intensive process. As a result, collaborators on this paper developed more automated methods to transform data into clinical evidence. That distinction matters: data is what we generate in the lab; evidence is what clinicians need to act.
The future we’re working toward involves functional data for at least 1,000 clinically relevant genes within the next five to ten years. At that scale, we cannot afford bottlenecks where data sits waiting to be manually translated for clinical use. That process must be automated – and this paper lays out how.
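[EDITOR’S NOTE: To make the idea of calibration concrete, here is a minimal Python sketch of one common way to turn raw scores into graded evidence: comparing how often known pathogenic and known benign variants fall below a score threshold. It is an illustration only, not the method used in the paper. The `Variant` class, the function names, the example scores, the assumption that lower scores mean more damaging, and the strength cut-offs are all hypothetical choices made for this example.]

```python
from dataclasses import dataclass


@dataclass
class Variant:
    name: str
    score: float  # raw assay or predictor score (assumed: lower = more damaging)
    label: str    # "pathogenic" or "benign", taken from known clinical data


def likelihood_ratio(variants, threshold, pseudocount=0.5):
    """Ratio of P(score <= threshold | pathogenic) to P(score <= threshold | benign).

    A pseudocount keeps the ratio finite when one group has no hits.
    """
    pathogenic = [v for v in variants if v.label == "pathogenic"]
    benign = [v for v in variants if v.label == "benign"]
    path_hits = sum(v.score <= threshold for v in pathogenic)
    benign_hits = sum(v.score <= threshold for v in benign)
    p_path = (path_hits + pseudocount) / (len(pathogenic) + 2 * pseudocount)
    p_benign = (benign_hits + pseudocount) / (len(benign) + 2 * pseudocount)
    return p_path / p_benign


# Illustrative cut-offs only; a real calibration would justify these values.
ILLUSTRATIVE_CUTOFFS = [
    (350.0, "very strong"),
    (18.7, "strong"),
    (4.33, "moderate"),
    (2.08, "supporting"),
]


def evidence_strength(lr):
    """Map a likelihood ratio to a named strength of evidence for pathogenicity."""
    for cutoff, name in ILLUSTRATIVE_CUTOFFS:
        if lr >= cutoff:
            return name
    return "indeterminate"


# Made-up control variants with known clinical labels.
controls = [
    Variant("p1", 0.1, "pathogenic"),
    Variant("p2", 0.2, "pathogenic"),
    Variant("p3", 0.3, "pathogenic"),
    Variant("p4", 0.9, "pathogenic"),
    Variant("b1", 0.8, "benign"),
    Variant("b2", 0.9, "benign"),
    Variant("b3", 1.0, "benign"),
    Variant("b4", 1.1, "benign"),
]

lr = likelihood_ratio(controls, threshold=0.5)
strength = evidence_strength(lr)
```

In this toy example, three of the four known pathogenic variants and none of the known benign variants score at or below 0.5, so a new variant scoring below that threshold would receive graded evidence toward pathogenicity rather than a bare number. Automating this kind of translation for every gene and every dataset is the bottleneck Starita describes.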
The paper notes that in 2020, the National Human Genome Research Institute predicted that by 2030 the term “VUS” would become obsolete, and that individuals from ancestrally diverse backgrounds would benefit equitably from advances in human genomics. Is that prediction still relevant? How does this paper move science closer to that goal?
Starita: This paper illustrates the formula for making the promise of precision medicine more realistic. The equity dimension is particularly important. Historically, large-scale research projects have underrepresented minorities and non-European populations – which means people from those backgrounds are disproportionately likely to receive a VUS result when they undergo genetic testing. Functional and predictive data help address that gap. We are leveling the playing field.
Fowler: The 2020 prediction was meant to motivate people to make it a reality. If anything, it’s more relevant now than ever – the number of uncertain variants identified through genetic testing has continued to grow, which makes achieving that goal more urgent. This brings us back to the scalable workflow. We show that combining large-scale experimental data and computational predictions can resolve approximately 75 percent of variants of uncertain significance. Applying this workflow across all clinically relevant genes in the genome would not make the term “VUS” obsolete, but it would make VUS far less of a problem than they are today.