Adeno-associated virus (AAV) is a naturally occurring, replication-deficient virus whose capsid is widely considered the frontrunner for solving the delivery problem in gene therapy. These viruses are known to be harmless to humans and are relatively simple to manipulate. One well-known drawback of the natural capsids currently used for delivery, however, is that many patients with pre-existing immunity to the virus (due to previous natural exposure) may be ineligible for life-changing treatment.
In previous work (published in Science), we validated the use of computational models in conjunction with high-throughput experiments to design better liver-targeting variants of naturally occurring AAV capsids. In that work we focused primarily on single edits to the capsid and hypothesized that the effect of a combination of single mutations, at least when the total number of edits is limited, can be approximated by the sum of the effects of the individual mutations. Through this approach we showed that a model-guided method can lead to more efficient design of capsids for more effective liver targeting.
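The additive hypothesis described above can be illustrated with a minimal sketch. The positions, residues, and effect values here are invented purely for illustration, and the function name `additive_score` is hypothetical; this is not the code or data used in the study.

```python
from typing import Dict, List, Tuple

# A mutation is (position, new_residue). All values below are made up.
Mutation = Tuple[int, str]


def additive_score(mutations: List[Mutation],
                   single_effects: Dict[Mutation, float]) -> float:
    """Approximate a multi-mutant's effect as the sum of its single-mutation effects."""
    return sum(single_effects[m] for m in mutations)


# Hypothetical measured effects of single edits (e.g. a log-enrichment score
# relative to the unmodified capsid).
single_effects = {
    (561, "A"): 0.5,
    (570, "K"): -0.25,
    (585, "S"): 1.0,
}

variant = [(561, "A"), (585, "S")]
print(additive_score(variant, single_effects))  # 1.5
```

Under this approximation each edit contributes independently of the others, which is why it can only be expected to hold while the number of combined edits stays small.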
The paradigm of measuring the effects of mutations independently and combining the best ones no longer works as we attempt to modify the capsid beyond a handful of mutations. Making capsids with many changes relative to natural variants increases our chances of being able to treat the thousands of potential recipients of gene therapy by evading pre-existing immunity. Introducing a large number of changes to the capsid sequence without breaking its essential abilities required a wholly new approach, which our latest study in Nature Biotechnology aims to provide. Our goal was to design highly diverse AAV capsids, for which we used much more advanced machine learning models trained on more complex datasets. The work was the result of years of collaboration between teams at Dyno, Harvard’s Wyss Institute, and Google Research.
To test these methods, we focused on a representative region of the capsid (positions 560-588, seen in pink in the fully assembled virus, the hexamer assembly, and the individual subunit in the figure above) that had both surface-exposed and buried residues (generally speaking, surface-exposed residues are known to be more mutation-tolerant). This region is also well known for the presence of immunogenic structures, as well as for its role in tissue targeting. Our aim was to introduce as many mutations as we could in this 28-amino-acid region, including substitutions and insertions, the latter of which is a less common type of mutation in nature. When we started this study, it was unknown whether machine learning models would be reliable for predicting the effects of mutations for variants beyond 5-10 edits to the original sequence. We expected this to be possible, however, based on our analysis of the diversity of sequences that have been isolated from natural sources. In this region, the average difference between two AAV serotypes is 12 amino acids (often with few or no insertions). Nonetheless, we pushed the models to propose sequences with up to 29 substitutions and insertions.
Using the naturally observed level of diversity as a benchmark, we set our goal to generate diversity beyond that observed in nature while maintaining the capsid’s viability. After screening billions of potential sequences in silico using machine learning models, we settled on ~200,000 designed variants, which we tested experimentally for viability. Of those, approximately 110,000 produced viable viruses (many of our attempts were deep into the sequence space, where it is very hard to propose viable viruses). About 57,000 variants were farther than 12 mutations away from the AAV2 serotype. By generating more than two thousand sequences that were 25 or more mutations away, we decisively demonstrated the power of machine learning models to design diverse synthetic capsid sequences.
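To make "mutations away" concrete, here is a minimal sketch that counts substitutions and insertions between a reference and a variant using Levenshtein edit distance. Treating the distance this way is an assumption made for illustration, not a description of the study's exact metric. The sequences are toy placeholders rather than real AAV2 sequence, and the helper names are hypothetical.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def farther_than(reference: str, variants: List[str], threshold: int) -> List[str]:
    """Return the variants whose edit distance from the reference exceeds threshold."""
    return [v for v in variants if edit_distance(reference, v) > threshold]


reference = "ACDEFGHIK"  # toy placeholder, not the real AAV2 region
variants = ["ACDEFGHIK", "ACDQFGHIK", "ACDEFWGHIK", "MCDQFWGHIR"]
print(farther_than(reference, variants, threshold=2))
```

With real data, the same filter with `threshold=12` would correspond to the "farther than 12 mutations away" cut described above.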
In this study, largely conducted before Dyno’s official launch, we report one of the largest AI-driven protein design assays published to date and validate the utility of these techniques for capsid engineering. The success of this approach bolstered our confidence in Dyno’s foundational science. Building upon this foundation, we have established infrastructure and machine learning techniques at Dyno to expand and optimize the AAV repertoire for multiple traits (including in vivo targeting of challenging tissues), for multiple serotypes, and at a larger scale. This study is just the beginning of our endeavour.
This work was a multi-year collaboration between Dyno co-founders Eric Kelsic, Sam Sinai, and George Church; colleagues at Harvard’s Wyss Institute, including Nina Jain and Pierce Ogden; and members of the Google Accelerated Science team, including Patrick Riley, co-first authors Drew Bryant and Ali Bashir, and co-corresponding author Lucy Colwell.
Delivering gene therapy’s promise
January 26, 2021
The human body consists of trillions of cells, each possessing its own copy of the DNA that provides the blueprint for making a human. Despite sharing the same DNA, different cells activate different mixes of genes, which enables them to differentiate into various tissues and organs. For millions of people, unfortunately, some of these genes do not function properly, resulting in diseases and conditions that are sometimes severely debilitating.
Currently, there are thousands of diseases of different organs that could be cured if the cells that rely on the defective gene were provided with a healthy copy of it. However, a significant obstacle to achieving such a cure is that it is challenging to deliver genes into the correct set of cells. The most promising route currently available uses AAV capsids, protein shells derived from the naturally benign human Adeno-associated Virus (AAV), as a vector to carry the healthy copy of the gene to the diseased organ or tissue. Unfortunately, natural AAV capsids have not evolved to be specialized in the way needed to deliver therapeutic genes. Most AAV capsids are not able to target the specific organ or tissue that a particular treatment requires. Moreover, the human immune system is familiar with these natural strains and can quickly neutralize them when they are used for gene therapy, preventing them from reaching their destination.
To circumvent this problem, the traditional way of engineering natural viruses has been to evolve them in the lab, generating new synthetic viruses that are both novel to the immune system and have higher affinity to the target tissue. Apart from being time-consuming, this process scales poorly as the lessons learned from one experiment don’t generalize to others. There are multiple reasons for this. For instance, the process of generating novelty to select from cannot be precisely controlled during an experiment, but rather occurs by random genetic mutations. With this approach, the traits of interest might simply be too rare to find with random variation. Additionally, using experimental selection processes to improve a specific property often presents a risk of losing other desirable properties of a capsid that are not being selected. An evolutionary path taken in one experiment may not be repeatable for other traits, and the experimental heuristics learned could be irrelevant for other properties. Given these significant shortcomings, success in the field has been limited to those experiments which just happened to produce good results for a particular therapy.
Dyno has instead opted to build a platform that aims to solve the delivery problem for all therapies, thereby removing a major obstacle to realizing effective gene therapies across thousands of diseases. Dyno’s model is built on the recognition that the problems of designing capsids capable of delivering genes to different organs and tissues share commonalities that, if captured, enable the design of specialized vectors quickly and efficiently. To pursue this opportunity, our platform combines two cutting-edge technologies:
Data gathered from these validation experiments helps make Dyno’s AI smarter, which in turn can help produce better and more diverse viruses. Dyno contends that over time, its AI system will become smart enough to design viruses that are capable of targeting any organ or cell type in the body and to predict whether they will work in humans, without the need for multiple experiments. The efficiency of our AI platform only grows as we engage with more partners to produce delivery vectors for different tissues and indications.
This is why Dyno’s approach has generated interest from companies that are industry leaders in gene therapy. Better delivery vectors are impactful and easy to adopt for current and future therapies. Our partners recognize the value of this technology and are helping to validate this approach – all the way into the clinic. Together with our partners, we hope to reach new heights in the gene therapy landscape and unlock its promise for the millions of patients whose lives would be transformed by effective delivery of a new gene.