Beyond CRISPR: How AI Is Designing Entirely New Gene Editors

When Eli Lilly announced a collaboration with Profluent Bio on April 28, 2026, the headlines focused on the deal economics and the gene editing angle. Both are interesting. But the more important story is buried one level deeper: Lilly is not licensing a naturally occurring CRISPR enzyme. It is betting on gene editors that were never discovered in nature: proteins that were imagined, designed, and built by artificial intelligence, starting from a blank page.

That distinction matters enormously for where the gene editing field is heading. The natural CRISPR toolbox, as rich as it has proven to be, is a finite resource. Researchers have now spent more than a decade mining bacterial and archaeal genomes for useful editing systems, and while the discoveries keep coming (compact Cas12f variants, CRISPR-Cas3, bridge RNA-guided recombinases), each is constrained by the evolutionary pressures that produced it. Bacteria did not evolve CRISPR to treat human disease. They evolved it to defend against bacteriophage. Every natural enzyme carries the fingerprints of that evolutionary history: PAM requirements tuned for telling phage DNA apart from the bacterium's own, protein structures optimized for bacterial DNA rather than human chromatin, and surfaces shaped by billions of years of phage arms races rather than by any need to evade the human immune system.

Profluent is proposing a different path entirely. Its platform uses large language models trained on protein sequence databases to generate novel nucleases from scratch — enzymes that have no natural ancestor, no evolutionary baggage, and properties that can be specified in advance. The Lilly deal is the first major signal that large pharmaceutical companies believe this approach has matured past the research curiosity phase and into something worth building drugs around.

What Profluent Does

Profluent Bio, founded in 2022 and headquartered in Berkeley, California, is an AI-first protein design company specializing in gene editing enzymes. Its core technology adapts the architecture of large language models — the same class of neural networks that powers modern generative AI — to the problem of protein sequence design.

The insight driving the company is a conceptual parallel: just as language models learn the statistical structure of human text by training on billions of words, protein language models can learn the statistical structure of protein sequences by training on the vast databases of known proteins. Once a model has internalized those patterns — the rules that govern which amino acid sequences fold into stable, functional proteins — it can generate entirely new sequences that satisfy those rules while optimizing for properties that no natural protein possesses.
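The parallel with text models can be made concrete with a deliberately tiny sketch. It assumes a first-order (bigram) model, which is a vastly simplified stand-in for the transformer-based protein language models the article describes, and the three training sequences are made up for illustration. The point is only the shape of the idea: count which residue tends to follow which, then sample new sequences from those statistics.

```python
import random
from collections import defaultdict

# Made-up toy "training set"; real models train on millions of natural sequences.
TRAINING_SEQUENCES = ["MKTAYIAKQR", "MKTLYIAKQG", "MKSAYLAKQR"]

def train_bigram_counts(sequences):
    """Count how often each residue follows another ('^' = start, '$' = end)."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        padded = "^" + seq + "$"
        for prev, nxt in zip(padded, padded[1:]):
            counts[prev][nxt] += 1
    return counts

def sample_sequence(counts, rng, max_len=50):
    """Generate a new sequence one residue at a time from the learned counts."""
    residues, prev = [], "^"
    while len(residues) < max_len:
        options = counts[prev]
        # Pick the next residue in proportion to how often it followed `prev`.
        nxt = rng.choices(list(options), weights=list(options.values()))[0]
        if nxt == "$":
            break
        residues.append(nxt)
        prev = nxt
    return "".join(residues)

counts = train_bigram_counts(TRAINING_SEQUENCES)
generated = sample_sequence(counts, random.Random(0))
print(generated)
```

A generated sequence can recombine fragments of the training examples into something that appears in none of them. A real protein language model does the same kind of thing with far longer-range context, which is what lets it capture the folding and function constraints described above.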

For gene editing, that means Profluent can, in principle, design nucleases with any combination of characteristics: a specific size to fit within a delivery vector, a PAM sequence chosen to maximize genome coverage, a protein surface engineered to minimize recognition by the human immune system, or an active site geometry tuned for precision insertion rather than blunt cutting.

The company's platform is not purely generative. It uses an iterative loop in which AI-designed sequences are synthesized, tested in cells, and the results fed back into the model. Each experimental cycle tightens the model's understanding of which designs work and why. This is closer to directed evolution than to pure computational design — but it operates at a scale and speed that no directed evolution campaign in a test tube could match.
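The design-test-learn cycle can be sketched as a toy loop. This is an illustration under loud assumptions, not Profluent's pipeline: `mock_assay`, `propose`, and `HIDDEN_OPTIMUM` are hypothetical stand-ins for a cell-based assay, a generative model, and the unknown property being optimized. The "learning" step is reduced to a greedy select-and-mutate rule, where the best-scoring designs seed the next round.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HIDDEN_OPTIMUM = "MKTAYIAKQR"  # hypothetical; stands in for what a wet-lab assay rewards

def mock_assay(seq):
    """Stand-in for a cell-based assay: fraction of positions matching the optimum."""
    return sum(a == b for a, b in zip(seq, HIDDEN_OPTIMUM)) / len(HIDDEN_OPTIMUM)

def propose(parent, rng):
    """Stand-in for the generative model: change one random position of a parent."""
    pos = rng.randrange(len(parent))
    return parent[:pos] + rng.choice(AMINO_ACIDS) + parent[pos + 1:]

def design_test_learn(rounds=30, batch=20, keep=5, seed=0):
    rng = random.Random(seed)
    # Start from random designs of the right length.
    pool = ["".join(rng.choice(AMINO_ACIDS) for _ in HIDDEN_OPTIMUM) for _ in range(keep)]
    history = []
    for _ in range(rounds):
        # Design: propose new variants; previous winners stay in the running.
        candidates = pool + [propose(rng.choice(pool), rng) for _ in range(batch)]
        # Test and feed back: the best-scoring designs seed the next round.
        pool = sorted(candidates, key=mock_assay, reverse=True)[:keep]
        history.append(mock_assay(pool[0]))
    return pool[0], history

best, history = design_test_learn()
print(best, history[-1])
```

Because each round keeps the previous winners among the candidates, the best score can never go down from one round to the next. A real platform would replace the selection rule with model retraining or fine-tuning on the assay data.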

OpenCRISPR: The Proof of Concept

In April 2024, Profluent released what it called OpenCRISPR-1, the first open-source AI-designed gene editing protein. This was a deliberate proof-of-concept: a functional nuclease generated entirely by the company's AI platform, released publicly with no strings attached, to demonstrate that the technology worked and to accelerate the broader scientific community's ability to evaluate it.

OpenCRISPR-1 showed editing activity in human cells, confirmed that the AI-designed protein could navigate the complexity of human chromatin and execute double-strand DNA cleavage, and established that the company's design pipeline could produce real, functional enzymes — not just proteins that looked good on a computer screen. The release was notable precisely because Profluent had nothing commercial to protect in releasing it; the company's value lies in the platform that generated the protein, not in any single protein itself.

OpenCRISPR demonstrated something more subtle too. Because the protein was generated rather than discovered, its immunological profile may differ from that of Cas9 or any other natural CRISPR enzyme. Many people have already been exposed to the bacteria that natural Cas9 proteins come from, and pre-existing immunity to Cas9 has been a genuine clinical concern, with studies finding that significant proportions of human populations carry antibodies against SpCas9 and SaCas9. A de novo AI-designed protein has no such exposure history, so the immune system is less likely to have encountered its exact sequence before, although that advantage still has to be demonstrated experimentally.

The Limits of Natural CRISPR Proteins

To understand why an AI-first approach is attractive, it helps to understand what natural CRISPR proteins cannot do — or cannot do well.

The PAM constraint

Every DNA-targeting CRISPR-Cas nuclease in common use requires a short DNA sequence adjacent to its target, called the protospacer adjacent motif (PAM). Cas9 from Streptococcus pyogenes (SpCas9) requires an NGG PAM, meaning any base followed by two guanines; roughly 10-20% of positions in any given genomic region lack a well-positioned NGG PAM, meaning SpCas9 cannot target those sites. Other natural enzymes have different PAM requirements, but each narrows the targetable space in its own way. Natural enzymes were not designed to maximize PAM flexibility; they evolved PAM recognition as part of their self/non-self discrimination system in bacteria, and human genomic coverage was never a selection pressure.

AI-designed enzymes can be optimized for any PAM — including minimal PAM requirements or PAM-independent activity — as a design specification from the outset.
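To make the PAM constraint concrete, here is a minimal sketch of how one might list candidate NGG PAM positions on both strands of a DNA sequence. The example sequence is made up, the function names are illustrative, and real guide design tools also check the 20-nucleotide protospacer next to each PAM, which this sketch omits.

```python
def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))

def find_ngg_pams(seq):
    """Return 0-based start indices of NGG motifs on the given strand."""
    return [i for i in range(len(seq) - 2) if seq[i + 1:i + 3] == "GG"]

dna = "ATGCGGTACCATTGGA"  # made-up example sequence
forward = find_ngg_pams(dna)
reverse = find_ngg_pams(reverse_complement(dna))
print(forward)  # [3, 12]
print(reverse)  # [5]
```

Indices for the reverse strand are relative to the reverse-complemented sequence. A target base with no PAM at a usable distance on either strand is out of reach for that enzyme, which is the gap that PAM-flexible designs aim to close.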

Immunogenicity

As noted above, pre-existing immunity to Cas9 is a documented concern for gene therapy. One study (Charlesworth et al., published in 2019 after circulating as a preprint in 2018) found anti-Cas9 antibodies in 78% of donors tested for SaCas9 and 58% for SpCas9, reflecting prior exposure to Staphylococcus aureus and Streptococcus pyogenes, the common bacteria these enzymes come from. This creates real risks for systemic delivery and complicates repeat dosing.

Proteins designed by AI can be generated with surface properties that minimize immunogenic epitopes — though this is an active area of research and proof of the advantage in humans has not yet been demonstrated in clinical trials.

Size and packaging

The packaging constraints of adeno-associated virus (AAV) vectors, the workhorse of in vivo gene delivery, impose a hard ceiling on how large a gene editing payload can be. The SpCas9 coding sequence alone is about 4.1 kb, the vast majority of an AAV's roughly 4.7 kb capacity, leaving little room for a promoter, a guide RNA cassette, or a donor template. Natural compact alternatives like Cas12f have lower editing efficiency at many loci. AI-designed proteins can, in principle, be specified to hit a target size window: small enough for single-AAV delivery but large enough to maintain the structural features that drive high efficiency.
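The packaging arithmetic is simple enough to sketch. The 4.7 kb capacity is the figure quoted above; the 1,368-residue length for SpCas9 is a commonly cited value that does not come from this article, and the 1,500 bp reserved for other elements is a purely hypothetical number chosen for illustration.

```python
AAV_CAPACITY_BP = 4700    # ~4.7 kb, the figure quoted in the text
SPCAS9_LENGTH_AA = 1368   # commonly cited SpCas9 length; not from the article

def coding_length_bp(protein_length_aa):
    """Three bases per amino acid, plus one stop codon."""
    return 3 * (protein_length_aa + 1)

def remaining_budget_bp(protein_length_aa, capacity_bp=AAV_CAPACITY_BP):
    """Space left for promoter, guide cassette, and other elements."""
    return capacity_bp - coding_length_bp(protein_length_aa)

def max_protein_length_aa(reserved_bp, capacity_bp=AAV_CAPACITY_BP):
    """Largest protein that still fits after reserving space for other elements."""
    return (capacity_bp - reserved_bp) // 3 - 1

print(coding_length_bp(SPCAS9_LENGTH_AA))        # 4107
print(remaining_budget_bp(SPCAS9_LENGTH_AA))     # 593
print(max_protein_length_aa(reserved_bp=1500))   # 1065 (hypothetical reserve)
```

Under these assumptions SpCas9 leaves under 600 bp for everything else, which is why a size target in the neighborhood of a thousand residues or fewer is attractive as a design specification.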

The insertion problem

Perhaps the most important limitation, and the one most directly relevant to the Lilly deal, is the difference between gene knockout and gene insertion.

Most CRISPR clinical programs to date have targeted gene knockout: disabling a harmful gene, or disrupting a regulatory element to reactivate a silenced gene. Knockout is relatively tractable because it exploits the cell's natural non-homologous end joining (NHEJ) repair pathway, which efficiently seals double-strand breaks and in doing so frequently introduces small insertions or deletions (indels). An indel whose length is not a multiple of three shifts the reading frame of the gene. You cut, the cell repairs sloppily, the gene is broken. Indel frequencies of 50-90% or more are achievable in well-optimized systems.

Gene insertion, placing a working copy of a gene into a precise genomic location, is fundamentally harder. The classic route is homology-directed repair (HDR), which copies new sequence in from a donor template but relies on machinery that is active mainly in dividing cells, so it is inefficient in post-mitotic or largely quiescent cells (neurons, cardiomyocytes, adult hepatocytes). Newer alternatives work around that dependence: homology-independent targeted insertion (HITI) uses the NHEJ pathway, and prime editing writes new sequence from a prime editing guide RNA (pegRNA) without making a double-strand break. Each comes with its own limits on efficiency, precision, or insert size.

Gene insertion is, however, what an important class of genetic diseases calls for. Diseases caused by the absence of a needed protein (hemophilia, lysosomal storage disorders, various monogenic metabolic diseases) require restoring a working gene, not just disrupting a harmful one. And in gain-of-function diseases, where a mutant protein actively causes harm rather than merely being absent, silencing the bad copy may not be enough: Huntington's disease, many forms of amyotrophic lateral sclerosis, and certain autosomal dominant conditions could potentially require replacing it with a functional one.

This is where the Lilly deal appears to focus. The precise terms of the collaboration have not been fully disclosed, but the strategic logic points toward gene insertion in therapeutically important tissues. And it is here that AI-designed enzymes offer a concrete advantage: insertion efficiency depends heavily on the precise geometry of how the nuclease cuts DNA, including the overhang structure, the nick configuration, and the physical accessibility of the cut site to the cell's repair machinery. These are properties that can be specified and optimized computationally, rather than accepted as given by natural enzyme evolution.

The Eli Lilly Deal

The collaboration, reported by STAT News on April 28, 2026, pairs Lilly's drug development infrastructure with Profluent's AI protein design platform. The financial terms disclosed publicly are limited, which is typical for early-stage biotech collaborations where both parties prefer not to signal the value of the underlying technology to competitors.

What is clear is the strategic framing on Lilly's side. The company is not approaching gene editing as a spectator. Eli Lilly completed its acquisition of Verve Therapeutics in July 2025 in a deal valued at up to approximately $1.3 billion, absorbing Verve's base editing platform targeting cardiovascular disease, most notably the VERVE-101 and VERVE-102 programs aimed at permanently reducing LDL cholesterol through single-dose base editing of the liver. That acquisition gave Lilly a clinical-stage framework for base editing but left a gap: base editors excel at single-nucleotide changes but cannot insert whole genes.

The Profluent collaboration fills a different part of the genetic medicine map. If Verve gives Lilly precision single-letter editing for loss-of-function programs, Profluent's AI-designed nucleases could, in principle, give the company a route to inserting whole gene cassettes at defined genomic safe harbors — a capability that would complement rather than duplicate the base editing platform.

The deal also signals something about Lilly's competitive posture. The pharmaceutical company has watched the gene editing field consolidate rapidly over the past three years, with Vertex Pharmaceuticals and CRISPR Therapeutics capturing the first approved CRISPR gene therapy (Casgevy, for sickle cell disease and beta-thalassemia) and a wave of in vivo programs advancing at Intellia, Editas, and others. Lilly's gene editing strategy is not about me-too CRISPR development — it is about securing access to next-generation tools before competitors lock up the key platform relationships.

What this is not

It is worth being precise about what the Profluent deal does not represent. This is a research collaboration, not a clinical-stage acquisition. Profluent's AI-designed editors have been validated in human cells in culture and at the protein characterization level, but they have not entered clinical trials. No AI-designed gene editor has yet been tested in a human patient. The path from "works in a dish" to "approved drug" involves years of preclinical safety and efficacy work, IND-enabling studies, Phase 1 dose escalation, and the full gauntlet of clinical development that every gene therapy program must survive.

The Lilly deal is, at this stage, a bet on platform potential — a strategic option that costs a fraction of a full acquisition and gives Lilly access to technology that could matter significantly if the science continues to advance.

The Competitive Landscape

Profluent is not the only company using AI to design or optimize gene editing proteins, but it occupies a distinctive niche. Most AI-protein design companies have focused on therapeutics like antibodies, enzymes, or small proteins. Applying generative AI specifically to nuclease design — proteins that must fold correctly, locate a specific DNA sequence inside a complex nucleus, cleave DNA with precision, and do so without triggering catastrophic off-target damage — is a harder and more specialized problem.

The closest structural competitors include:

Dyno Therapeutics: focused on AI-designed AAV capsids rather than editing proteins. Dyno's platform generates novel AAV capsids with improved tissue targeting and reduced immunogenicity. The company partners with GSK, Sarepta, and Novartis, and represents the same AI-protein-design-for-gene-therapy thesis applied to the delivery vehicle rather than the editor itself.

Absci: a broader protein design company that has demonstrated AI-generated antibodies that were validated in wet lab testing. Absci's platform is not CRISPR-focused but represents the maturing state of generative AI protein design across the industry.

EvolutionaryScale: launched by former Meta AI researchers who built the ESM (Evolutionary Scale Modeling) protein language models and the ESMFold structure prediction model while at Meta; the company has since released ESM3. Its work is foundational to the field, and ESM-family models power aspects of protein design at several biotechs.

Metagenomi: takes a different approach, mining metagenomics datasets for novel natural CRISPR systems rather than designing them de novo. The company has partnerships with Moderna and has identified diverse nucleases with unique properties. This is still discovery-from-nature, not generation from scratch.

Tessera Therapeutics: developing Gene Writing technology based on mobile genetic elements for large gene insertion, competing at the application level (gene insertion in particular) if not at the platform level.

The distinction that matters competitively is between discovering natural proteins (Metagenomi, most academic CRISPR groups) and generating novel proteins (Profluent, with EvolutionaryScale's models as enabling infrastructure). The generating approach is earlier-stage and riskier but unconstrained by what nature happened to evolve.

Risks and Unknowns

It would be a mistake to read the Lilly deal as validation that AI-designed gene editors are ready for the clinic. They are not — yet. The honest accounting of risks is substantial.

No clinical track record

No AI-designed gene editor has been tested in a human. Every stage of clinical development brings surprises that cannot be anticipated from cell culture or animal data. Immunogenicity in particular is notoriously difficult to predict — a protein that appears non-immunogenic in mice or in in vitro assays may trigger unexpected immune responses in human patients, especially after systemic administration.

Off-target activity

Natural CRISPR enzymes have well-characterized off-target profiles that researchers have spent years developing tools to measure and mitigate — GUIDE-seq, CIRCLE-seq, CHANGE-seq, and a battery of other unbiased genome-wide assays. AI-designed enzymes are novel enough that the off-target landscape is not automatically predictable from any prior dataset. Full characterization requires the same experimental work, plus the additional challenge that there is no evolutionary precedent to draw on for predicting behavior.

The efficiency gap in vivo

High editing efficiencies in human cells in culture do not always translate to equivalent efficiencies in animal models or in patients. The in vivo delivery environment — immune surveillance, physical barriers to entry, the crowded reality of chromatin inside a nucleus in a living tissue — is vastly more complex than what a cell culture experiment captures. Gene insertion specifically remains inefficient in post-mitotic tissues even with the best-optimized natural editors, and AI design has not yet demonstrated a clear path past that fundamental biological constraint.

Regulatory novelty

Regulatory agencies including the FDA evaluate gene therapy products through established frameworks that were developed with natural-origin proteins in mind. AI-generated proteins are functionally similar — they are proteins — but the manufacturing and characterization requirements are evolving. This is not a showstopper, but it adds regulatory uncertainty to an already complex development path.

Platform versus product risk

Profluent's value is in its platform. A research collaboration with Lilly is encouraging, but it does not validate any specific drug candidate. Platform companies can be right about the technology and still fail to produce approved drugs if any single program stumbles in development. The commercial success of an AI-designed gene editor requires not just that the platform generates good proteins, but that one specific protein becomes a medicine that works in patients.

What This Means for the Field

The Lilly-Profluent deal is directionally significant even if its immediate outputs are uncertain. It represents several shifts that are worth tracking.

Big pharma is now a buyer of AI protein design, not just a curious observer. Lilly's gene editing deal-making (Verve for up to $1.3 billion, now a Profluent collaboration) signals that genome medicine has moved from a specialty biotech curiosity into the core pipeline strategy of large pharmaceutical companies. When pharma buys, the technology has typically already cleared the "is this real" threshold in scientific circles.

The natural CRISPR toolbox has a real competitor. For the past decade, the gene editing field has been organized around a taxonomy of natural enzymes and their derivatives: Cas9, Cas12a, Cas12f, Cas13, base editors built on Cas9 scaffolds, prime editors built on Cas9 nickases. Profluent's approach challenges the assumption that the natural toolbox is the only starting point. If AI-designed editors can routinely match natural editors on efficiency, surpass them on immunogenicity, and be specified for delivery properties from the design stage, the competitive advantage of mining nature's databases weakens.

Gene insertion programs will accelerate. The difficulty of gene insertion has held back a large segment of genetic disease programs. Companies with platforms specifically optimized for efficient insertion — whether through AI-designed nucleases, prime editing, base editing in tandem, or Gene Writing approaches — are going to attract disproportionate investment as the knockout programs mature and the next frontier becomes clear.

The immunogenicity problem may finally get a tractable solution. Pre-existing immunity to natural CRISPR proteins is a genuine clinical constraint, particularly for systemic delivery and repeat dosing. AI-designed proteins with novel surfaces are the most credible approach to solving this problem at the protein level, rather than through immunosuppression or other secondary interventions. If Profluent or others can demonstrate reduced immunogenicity in primate models or early clinical data, that single data point would substantially increase the value of the entire platform category.

The honest answer to the question "will AI-designed gene editors reach patients?" is: almost certainly yes, eventually — but the timeline is measured in years, the attrition rate in clinical development is high, and the specific proteins that become drugs may bear little resemblance to today's proof-of-concept designs. What the Lilly deal signals is that the field has moved from asking whether AI protein design works to asking what it can produce that natural discovery cannot.

That is a meaningful shift. It will take another decade to fully measure.

Sources & Further Reading

Nolot, E. (2026, April 28). Eli Lilly strikes gene editing deal with Profluent, an AI company designing novel CRISPR proteins. STAT News. https://www.statnews.com/2026/04/28/eli-lilly-crispr-gene-editing-deal-profluent-ai/
Profluent Bio. (2024, April). OpenCRISPR: An open-source AI-generated gene editor. Profluent.bio.
Anzalone, A. V., Koblan, L. W., & Liu, D. R. (2020). Genome editing with CRISPR–Cas nucleases, base editors, transposases and prime editors. Nature Biotechnology, 38, 824–844.
Charlesworth, C. T., et al. (2019). Identification of preexisting adaptive immunity to Cas9 proteins in humans. Nature Medicine, 25, 249–254.
Eli Lilly and Company. (2025, July). Lilly completes acquisition of Verve Therapeutics. Investor relations press release.
Komor, A. C., Badran, A. H., & Liu, D. R. (2017). Editing the genome without double-stranded DNA breaks. ACS Chemical Biology, 13(2), 383–388.
Madani, A., et al. (2023). Large language models generate functional protein sequences across diverse families. Nature Biotechnology, 41, 1099–1106.
Ryu, S. M., et al. (2018). Adenine base editing in mouse embryos and an adult mouse model of Duchenne muscular dystrophy. Nature Biotechnology, 36, 536–539.
Anzalone, A. V., et al. (2022). Programmable large-scale DNA insertion using prime editing. Nature Biotechnology. https://doi.org/10.1038/s41587-022-01600-4