Analytical Genomics


The group works on problems connected with the functioning and evolution of biological systems. We use mathematical tools from statistics and combinatorics, together with algorithmic and molecular physics tools, to study the basic principles of cellular functioning starting from genomic data. We run several closely linked projects in parallel, all aimed at understanding the basic principles of evolution and co-evolution of molecular structures in the cell.

Four projects concern protein evolution and the development of bioinformatics and molecular modeling tools for the detection of:

1. distantly related proteins. We develop novel computational approaches to annotation based on sequence and protein family learning.
2. networks of co-evolved and/or dynamically correlated residues. On the one hand, a fine combinatorial analysis of phylogenetic trees allows us to reconstruct networks of co-evolved residues from sequence analysis. On the other hand, a thorough characterization of inter-residue dynamical correlations enables the detection of communication pathways across protein structures. By combining these approaches, we aim to predict interaction sites, mechanical and allosteric properties, and folding pathways in proteins.
3. functional sites on protein complexes and detection of potential protein partners. We combine evolutionary information (how evolution modified proteins to enhance their function) and molecular modeling (computational determination of the relative position of two interacting protein partners) to identify potential interactions.
4. alternative functional conformations of proteins. We develop and apply methods to describe the complex transitions of biomolecules, with the aim of predicting alternative conformations that play a functional role and that are suitable for drug targeting.
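As a rough illustration of what "co-evolved residues" means in item 2, the sketch below scores the co-variation of two alignment columns with mutual information. This is a generic textbook score, not the combinatorial phylogenetic-tree analysis described above; the function names and the toy alignment are hypothetical and chosen only for this example.

```python
from collections import Counter
from math import log2


def column(alignment, i):
    """Return the residues found at position i of every aligned sequence."""
    return [seq[i] for seq in alignment]


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    Assumption: sequences are pre-aligned, equal-length strings. No
    phylogenetic or finite-sample correction is applied, so this is
    only a naive co-variation score.
    """
    n = len(alignment)
    col_i = column(alignment, i)
    col_j = column(alignment, j)
    counts_i = Counter(col_i)
    counts_j = Counter(col_j)
    counts_ij = Counter(zip(col_i, col_j))

    mi = 0.0
    for (a, b), count in counts_ij.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 0 and 1 always change together,
# while columns 0 and 2 vary independently of each other.
toy_alignment = ["ACDA", "ACEA", "GTDA", "GTEA"]

covarying = mutual_information(toy_alignment, 0, 1)
independent = mutual_information(toy_alignment, 0, 2)
```

Scores of this kind are sensitive to the shared ancestry of the sequences, which is one reason tree-aware analyses are used in practice.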

Four projects concern sequence evolution:

5. in microbial organisms: essential genes, synthetic biology and genome evolution. We extract information about environmental organization and essential metabolic networks from codon bias analysis. We aim at reconstructing metabolic networks from metagenomic samples, at genome synthesis, and at modeling the evolvability of gene expression under changing environmental conditions.
6. in eukaryotic organisms: ab initio detection of miRNAs. We work on a novel ab initio approach to discover miRNAs organized in clusters. We also study the functional organisation of miRNA clusters.
7. reconstruction of ancestral genomes and of chromosomal rearrangement dynamics. We develop general methods for reconstructing ancestral genomes and the history of rearrangements. We focus on the phylogenetic tree of the Lachancea genus.
8. statistical methods for transcriptome analysis by deep sequencing. New sequencing technologies enable profiling the transcriptome of given cell types with unprecedented precision. We develop methods for detection and estimation of transcript expression levels.
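To make the notion of codon bias in item 5 concrete, here is a minimal sketch that computes relative synonymous codon usage (RSCU) for one amino acid. It is a standard summary statistic, not necessarily the measure used in these projects; the function names and the toy coding sequence are hypothetical.

```python
from collections import Counter

# The six leucine codons of the standard genetic code.
LEUCINE_CODONS = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")


def count_codons(cds):
    """Count in-frame codons of a coding sequence, ignoring a trailing partial codon."""
    usable = len(cds) - len(cds) % 3
    return Counter(cds[k:k + 3] for k in range(0, usable, 3))


def rscu(cds, synonymous_codons):
    """Relative synonymous codon usage for one family of synonymous codons.

    A value of 1.0 means the codon is used as often as expected under
    uniform usage; values above 1.0 indicate a preferred codon.
    """
    counts = count_codons(cds.upper())
    total = sum(counts[c] for c in synonymous_codons)
    if total == 0:
        return {c: 0.0 for c in synonymous_codons}
    expected = total / len(synonymous_codons)
    return {c: counts[c] / expected for c in synonymous_codons}


# Hypothetical toy coding sequence: ATG CTG CTG CTG TTA GCC TAA
toy_cds = "ATGCTGCTGCTGTTAGCCTAA"
leucine_usage = rscu(toy_cds, LEUCINE_CODONS)
```

In this toy sequence leucine is encoded three times by CTG and once by TTA, so CTG scores well above 1.0 while the unused codons score 0.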

Applications are numerous and include directed mutagenesis, synthetic biology, metagenomic data organisation and gene annotation.

Selected Publications
Shrestha AMS, Asai K, Frith M, Richard H. Jointly aligning a group of DNA reads improves accuracy of identifying large deletions. Nucleic Acids Research. 46(3), (2018).
Raucci R, Laine E, Carbone A*. Local Interaction Signal Analysis predicts protein-protein binding affinity. Structure. (2018).
Ugarte A, Vicedomini R, Bernardes JS, Carbone A*. A multi-source domain annotation pipeline for quantitative metagenomic and metatranscriptomic functional profiling. Microbiome. (2018).
Laine E, Carbone A*. Protein social behaviour makes a stronger signal for partner identification than surface geometry. Proteins. 85(1), pp.137-154 (2017).
Nadalin F, Carbone A*. Protein-protein interaction specificity is captured by contact preferences and interface composition. Bioinformatics. (2017).
Oteri F, Nadalin F, Champeimont R, Carbone A*. BIS2Analyzer: a server for coevolution analysis of conserved protein families. Nucleic Acids Research. (2017).
Champeimont R, Laine E, Hu S-W, Penin F, Carbone A*. Coevolution analysis of Hepatitis C virus genome to identify the structural and functional dependency network of viral proteins. Scientific Reports. 6, (2016).
Bernardes JS, Zaverucha G, Vaquero C, Carbone A*. Improvement in protein domain identification is reached by breaking consensus, with the agreement of many profiles and domain co-occurrence. PLoS Computational Biology. 12(7), (2016).
Ripoche H, Laine E, Ceres N, Carbone A*. JET2 Viewer: a database of predicted multiple, possibly overlapping, protein-protein interaction sites for PDB structures. Nucleic Acids Research. (2016).
Laine E, Carbone A*. Local Geometry and Evolutionary Conservation of Protein Surfaces Reveal the Multiple Recognition Patches in Protein-Protein Interactions. PLoS Computational Biology. 11(12), pp.e1004580 (2015).
Bernardes JS, Vieira FRJ, Zaverucha G, Carbone A*. A multi-objective optimization approach accurately resolves protein domain architectures. Bioinformatics. (2015).
Mirauta B, Nicolas P, Richard H. Parseq: reconstruction of microbial transcription landscape from RNA-Seq read counts using state-space models. Bioinformatics. 30(10), pp.1409-16 (2014).
Drillon G, Carbone A*, Fischer G. SynChro: a fast and easy tool to reconstruct and visualize synteny blocks along eukaryotic chromosomes. PLoS One. 9(3), pp.e92621 (2014).
Schulz MH, Weese D, Holtgrewe M, Dimitrova V, Niu S, Reinert K, Richard H. Fiona: a parallel and automatic strategy for read error correction. Bioinformatics. 30(17), pp.i356-63 (2014).
Mathelier A, Carbone A*. Large scale chromosomal mapping of human microRNA structural clusters. Nucleic Acids Research. 41(8), pp.4392-408 (2013).
Lopes A, Sacquin-Mora S, Dimitrova V, Laine E, Ponty Y, Carbone A*. Protein-protein interactions in a crowded environment: an analysis via cross-docking simulations and evolutionary information. PLoS Computational Biology. 9(12), pp.e1003369 (2013).