Journal article

Sequence-structure-function relationships in the microbial protein universe


Authors list: Leman, Julia Koehler; Szczerbiak, Pawel; Renfrew, P. Douglas; Gligorijevic, Vladimir; Berenberg, Daniel; Vatanen, Tommi; Taylor, Bryn C.; Chandler, Chris; Janssen, Stefan; Pataki, Andras; Carriero, Nick; Fisk, Ian; Xavier, Ramnik J.; Knight, Rob; Bonneau, Richard; Kosciolek, Tomasz

Publication year: 2023

Journal: Nature Communications

Volume number: 14

Issue number: 1

eISSN: 2041-1723

Open access status: Gold

DOI Link: https://doi.org/10.1038/s41467-023-37896-w

Publisher: Nature Research


Abstract

For the past half-century, structural biologists relied on the notion that similar protein sequences give rise to similar structures and functions. While this assumption has driven research to explore certain parts of the protein universe, it disregards spaces that don't rely on this assumption. Here we explore areas of the protein universe where similar protein functions can be achieved by different sequences and different structures. We predict ~200,000 structures for diverse protein sequences from 1,003 representative genomes across the microbial tree of life and annotate them functionally on a per-residue basis. Structure prediction is accomplished using the World Community Grid, a large-scale citizen science initiative. The resulting database of structural models is complementary to the AlphaFold database, with regards to domains of life as well as sequence diversity and sequence length. We identify 148 novel folds and describe examples where we map specific functions to structural motifs. We also show that the structural space is continuous and largely saturated, highlighting the need for a shift in focus across all branches of biology, from obtaining structures to putting them into context and from sequence-based to sequence-structure-function based meta-omics analyses.

Advances in protein structure prediction have led to a significant influx of protein structure data. Here the authors exploit this data to offer an unbiased overview of complex sequence-structure-function relationships in the protein universe. This work opens up new uses for 3D structure data repositories in meta-omics and other fields of biology.




Citation Styles

Harvard Citation style: Leman, J., Szczerbiak, P., Renfrew, P., Gligorijevic, V., Berenberg, D., Vatanen, T., et al. (2023) Sequence-structure-function relationships in the microbial protein universe, Nature Communications, 14(1), Article 2351. https://doi.org/10.1038/s41467-023-37896-w

APA Citation style: Leman, J., Szczerbiak, P., Renfrew, P., Gligorijevic, V., Berenberg, D., Vatanen, T., Taylor, B., Chandler, C., Janssen, S., Pataki, A., Carriero, N., Fisk, I., Xavier, R., Knight, R., Bonneau, R., & Kosciolek, T. (2023). Sequence-structure-function relationships in the microbial protein universe. Nature Communications, 14(1), Article 2351. https://doi.org/10.1038/s41467-023-37896-w



Keywords


CONSENSUS PREDICTION; COVERAGE; FOLD SPACE; IMMUNE-SYSTEM; MULTIPLICITY; STRUCTURE SPACE

Last updated on 2025-07-18 at 09:26