AI Tool Releases Database of One Billion Predicted Protein Structures
Researchers at the Chan Zuckerberg Initiative Biohub released the ESM Atlas, which contains more than one billion predicted protein structures. The database was generated using the ESMFold2 model and is described in a preprint posted today.
news-medical.net
Researchers at the Chan Zuckerberg Initiative Biohub released a database containing more than one billion predicted protein structures along with billions of additional protein sequences. The database is called the ESM Atlas. It was generated using ESMFold2, an artificial-intelligence model developed by the Biohub team.
The ESM Atlas contains over 800 million more entries than the AlphaFold Database of predicted protein structures. It also exceeds an earlier version of the ESM Atlas by roughly 300 million entries. The predictions include metagenomic sequences drawn from soil, ocean, and other environments that are not present in the AlphaFold Database.
Biohub states that ESMFold2 outperforms AlphaFold3 and other current protein-structure prediction systems at predicting the structures of protein complexes, including antibody-antigen interactions. The model is based on a protein language model released by the same team in 2024 and trained on billions of proteins from across the tree of life.
ESMFold2 is fully open source.
The atlas is described in a preprint released today. Alex Rives, Biohub science head, said the atlas shows the totality of protein biology and especially the parts that are most unknown. Other researchers noted the open-source status of the model while observing that multiple competing protein-prediction systems continue to advance.
Story Timeline
3 events
- 2024
Biohub team released the protein language model that ESMFold2 is based on.
1 source @Nature
- Today
Researchers released the ESM Atlas containing over one billion predicted protein structures.
1 source @Nature
- Today
A preprint describing the atlas was posted.
1 source @Nature
Potential Impact
- 01
Open-source release allows other groups to run the model on additional sequences.
- 02
Researchers may use the atlas to identify previously unknown protein functions.