Our group works on various problems connected with the functioning and evolution of biological systems. We use mathematical tools from statistics and combinatorics, together with algorithmic and molecular physics tools, to study the basic principles of cellular functioning starting from genomic data. We run several closely linked projects in parallel, all aimed at understanding the basic principles of evolution and co-evolution of molecular structures in the cell. Applications are in medicine and the environment.
Domain annotation and metagenomics - We are developing a new approach to domain annotation that successfully identifies remote homology. Domains are modeled through a multitude of probabilistic models, in contrast to the usual approaches based on consensus sequences. This method is now being extended to metagenomic annotation.
Transcriptomics and sequence analysis - We combine statistical modeling with combinatorial optimization to provide solutions for each analysis step of a sequencing experiment, with a particular interest in transcriptome sequencing (reconstruction of the transcriptional landscape, enumeration of alternative splicing events).
Protein evolution and interactions - Protein-protein interactions (PPIs) are at the heart of the molecular processes that constitute life. We are creating a large-scale map of PPIs with information at the molecular level. We use sequence- and structure-based bioinformatics methods to predict the conformation of interacting proteins, their interaction sites, and also which proteins interact and how strongly.
Protein conformational dynamics - We study protein conformational dynamics to predict the effects of disease-associated mutations and to characterize alternative functional conformations to be targeted by drugs.
Our methods have applications in directed mutagenesis, synthetic biology, metagenomics and the environment, gene annotation, and the study of mutations in genetic diseases.
Improvement in protein domain identification is reached by breaking consensus, with the agreement of many profiles and domain co-occurrence. Bernardes JS, Zaverucha G, Vaquero C, Carbone A. (2016) PLoS Computational Biology. 12: e1005038
We address the fundamental question of domain identification for highly divergent proteins. By using high performance computing, we demonstrate that the limits of state-of-the-art annotation methods can be bypassed. Our strategy is based on the observation that many structural and functional protein constraints are not globally conserved across all species but might be locally conserved in separate clades. We successfully predict at least one domain for 72% of P. falciparum proteins, compared with the 63% achieved previously, corresponding to a 30% improvement over the total number of Pfam domain predictions on the whole genome. http://www.lcqb.upmc.fr/CLADE.
The Metagenomics and Metadesign of the Subways and Urban Biomes (MetaSUB) International Consortium inaugural meeting report. MetaSUB International Consortium (Richard H.) (2016) Microbiome. 4: 24
The Metagenomics and Metadesign of the Subways and Urban Biomes (MetaSUB) International Consortium is a novel, interdisciplinary initiative comprised of experts across many fields, including genomics, data analysis, engineering, public health, and architecture. The ultimate goal of the MetaSUB Consortium is to improve city utilization and planning through the detection, measurement, and design of metagenomics within urban environments. We are developing new annotation methods that will be useful for the entire project. The data produced by the consortium can aid city planners, public health officials, and architectural designers and will lead to the discovery of new species, global maps of antimicrobial resistance (AMR) markers, and novel biosynthetic gene clusters (BGCs). Read more on Mapping the subway's microbiome.
JET2 Viewer: a database of predicted multiple, possibly overlapping, protein-protein interaction sites for PDB structures. Ripoche H, Laine E, Ceres N, Carbone A. (2016) Nucleic Acids Research. 45: D236-D242
We report predictions of protein-protein interfaces for the non-redundant set of all protein chains for which a structure is available in the Protein Data Bank. The predictions were made using JET2 and were evaluated on more than 15 000 experimentally characterized protein interfaces. This is, to our knowledge, the largest evaluation of a protein binding site prediction method. The overall performance of JET2 on all interfaces is: sensitivity (Sen) = 52.52, positive predictive value (PPV) = 51.24, specificity (Spe) = 80.05, accuracy (Acc) = 75.89. The knowledge base contains more than 20 000 entries and is freely accessible at: http://www.jet2viewer.upmc.fr.
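For readers unfamiliar with the four scores quoted above, the following is a minimal sketch of how sensitivity, PPV, specificity and accuracy are conventionally computed from per-residue counts of true/false positives and negatives. The function name `interface_metrics` and the example counts are hypothetical and chosen only for illustration; this is not the JET2 evaluation code, and the way JET2 aggregates scores over interfaces is not described here.

```python
def interface_metrics(tp, fp, tn, fn):
    """Standard binary-classification scores, expressed as percentages.

    tp: residues predicted at the interface that are at the interface
    fp: residues predicted at the interface that are not
    tn: residues predicted outside the interface that are outside
    fn: interface residues that the prediction missed
    """
    def pct(num, den):
        # Return NaN rather than raising when a denominator is empty.
        return 100.0 * num / den if den else float("nan")

    return {
        "Sen": pct(tp, tp + fn),                   # sensitivity (recall)
        "PPV": pct(tp, tp + fp),                   # positive predictive value
        "Spe": pct(tn, tn + fp),                   # specificity
        "Acc": pct(tp + tn, tp + fp + tn + fn),    # accuracy
    }


# Hypothetical counts for a single protein chain (illustration only).
scores = interface_metrics(tp=50, fp=50, tn=200, fn=50)
for name, value in scores.items():
    print(f"{name} = {value:.2f}")
```

Because interface residues are usually a minority of the surface, specificity and accuracy tend to be higher than sensitivity and PPV, which is the pattern seen in the reported values.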
Protein social behaviour makes a stronger signal for partner identification than surface geometry. Laine E, Carbone A. (2017) Proteins. 85: 137-154
We show that characterizing how a protein behaves with many potential interactors in a complete cross-docking study leads to a sharp identification of its cellular/true/native partner(s). We define a sociability index, or S-index, reflecting whether or not a protein tends to pair with other proteins. We show that sociability is an important factor and that the normalization achieves a much higher discriminative power than shape complementarity docking scores. The social effect is also observed with more sophisticated docking algorithms. http://www.lcqb.upmc.fr/CCDGeomDock/.