Projects

Active learning links DNA sequence to condition-dependent gene expression

Closed-loop sequence-to-expression modeling to accelerate promoter engineering with fewer experiments.

  • Machine learning
  • Sequence-to-function
  • Active learning
  • Genomic language models
  • Predictive modeling
  • Promoter engineering

This project develops sequence-to-expression models for promoter design across environmental contexts. The workflow trains machine learning models, including models that use genomic language model features, on measured promoter activity and uses them to propose new sequences for follow-up testing. The near-term goal is to evaluate whether active learning can reduce the number of experiments needed to identify condition-dependent promoter designs.
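The closed loop described above can be sketched in a few lines. This is only an illustrative toy under stated assumptions, not the project's implementation: it uses one-hot features in place of genomic language model features, a bootstrap ensemble of ridge regressions as the surrogate model, an upper-confidence-bound rule to pick the next batch, and a single condition. Every name here (`one_hot`, `ensemble_predict`, `select_batch`, `run_active_learning`, `toy_measure`) is hypothetical, and `toy_measure` is a made-up stand-in for a wet-lab measurement.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Flatten a DNA sequence into a one-hot vector (4 values per position)."""
    idx = np.array([BASES.index(b) for b in seq])
    return np.eye(4)[idx].ravel()

def fit_ridge(X, y, lam=1.0):
    """Closed-form ridge regression; returns the weight vector."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def ensemble_predict(X_train, y_train, X_pool, n_models=20, rng=None):
    """Bootstrap ensemble: mean and spread of predictions over resampled fits."""
    if rng is None:
        rng = np.random.default_rng(0)
    preds = []
    for _ in range(n_models):
        rows = rng.integers(0, len(y_train), size=len(y_train))
        w = fit_ridge(X_train[rows], y_train[rows])
        preds.append(X_pool @ w)
    preds = np.array(preds)
    return preds.mean(axis=0), preds.std(axis=0)

def select_batch(mean, std, batch_size, kappa=1.0):
    """Upper confidence bound: favor high predicted activity and high uncertainty."""
    score = mean + kappa * std
    return np.argsort(score)[::-1][:batch_size]

def run_active_learning(pool_seqs, measure, n_init=8, batch_size=8,
                        n_rounds=3, seed=0):
    """Measure a random seed set, then alternate model fitting and batch selection."""
    rng = np.random.default_rng(seed)
    X_pool = np.stack([one_hot(s) for s in pool_seqs])
    labeled = [int(i) for i in rng.choice(len(pool_seqs), size=n_init, replace=False)]
    y = {i: measure(pool_seqs[i]) for i in labeled}
    for _ in range(n_rounds):
        unlabeled = np.array([i for i in range(len(pool_seqs)) if i not in y],
                             dtype=int)
        y_train = np.array([y[i] for i in labeled])
        mean, std = ensemble_predict(X_pool[labeled], y_train,
                                     X_pool[unlabeled], rng=rng)
        for i in unlabeled[select_batch(mean, std, batch_size)]:
            y[int(i)] = measure(pool_seqs[int(i)])  # "run the experiment"
            labeled.append(int(i))
    return labeled, y

def toy_measure(seq):
    """Hypothetical assay stand-in: activity grows with the number of 'TA' dinucleotides."""
    return float(seq.count("TA"))

# Toy usage: a random pool of 200 candidate 20-mers, 8 initial + 3 x 8 selected.
pool_rng = np.random.default_rng(1)
pool = ["".join(pool_rng.choice(list(BASES), size=20)) for _ in range(200)]
labeled, y = run_active_learning(pool, toy_measure)
print(len(labeled), "sequences measured; best toy activity:", max(y.values()))
```

To test whether active learning saves experiments, one would compare the best activity found after a fixed number of measurements against a baseline that picks the same number of sequences at random. Handling several conditions would require adding the condition to the model input or fitting one model per condition, which this sketch leaves out.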

Figure: Overview of the promoter design workflow for sequence-to-expression active learning.

Work is ongoing, with additional modeling results, active-learning runs, and sequence-design examples to follow.