Generative design of sequence specific DNA binding proteins
Sehgal, E.; Politanska, Y.; Mitra, R.; Kim, P. T.; Gonzalez Rodriguez, N.; Warrier, T.; Kubaney, A.; Morishita, A.; Quijano, R.; Butcher, J.; Krishna, R.; Pecoraro, R.; Belmont, B.; Roullier, N.; Goreshnik, I.; Vafeados, D. K.; Kwon, P.; Ramarao, R.; Taipale, J.; Glasscock, C. J.; Baker, D.
Abstract
De novo protein design has advanced rapidly in recent years, yet the programmable recognition of specific DNA sequences remains a longstanding challenge. Here we describe a deep-learning-based approach for designing sequence-selective DNA-binding proteins. Our method combines structure generation using RFdiffusion3 with explicit screening against off-target interactions using AlphaFold3. We test this approach by generating 96 designs for each of 15 diverse DNA targets and identify specific binders for 7 of the 15 targets, representing a ~100-fold improvement in success rate over previous approaches. We further characterize the binding landscape using variant competition assays and randomized library screening, which reveal robust sequence discrimination across diverse targets. Together, these results represent a significant step forward in de novo design of sequence-specific DNA binders.