COMmunication MApping

A method to dissect protein dynamical architectures.

Overview

Proteins adapt to environmental conditions by changing their shape and motions. Characterising protein conformational dynamics is increasingly recognised as necessary to understand how proteins function. Given a conformational ensemble, computational tools are needed to systematically extract pertinent and comprehensive biological information. Communication Mapping (COMMA) is a new method to decipher the dynamical architecture of a protein. The method first extracts residue-based dynamic properties from all-atom molecular dynamics simulations. It then integrates them in a graph-theoretic framework, where it identifies groups of residues or protein regions that mediate allosteric communication. COMMA introduces two original concepts to contrast the different roles played by these regions, namely communication blocks and communicating segment pairs, and evaluates the connections and communication strengths between them. The method allows direct comparison of the dynamical behaviour of proteins with different characteristics, or of the same protein under different conditions. COMMA is a fully automated tool with broad applicability and is freely available to the community.

Download

The COMMA package is available here.

The results of COMMA on the 4 studied systems (Protein A, P53, KIT wild-type and KIT D816V) are downloadable here.

System requirements

Platform independent.

COMMA requires the following external tools to be installed:

- Eigen, a C++ library for linear algebra.
- MDTraj, a Python library to analyse MD simulation trajectories.
  McGibbon RT, Beauchamp KA, Harrigan MP, Klein C, Swails JM, Hernández CX, Schwantes CR, Wang LP, Lane TJ, Pande VS. (2015) Biophysical Journal, 109:1528-1532.
- NumPy, a Python package.
- PyMOL (external tool).
  DeLano WL. (2002) The PyMOL molecular graphics system.
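
Before installing COMMA, it can help to confirm that the Python-side requirements listed above are visible from the interpreter you intend to use. The snippet below is a minimal sketch, not part of the COMMA package: the function names are hypothetical, and it assumes the PyMOL executable is called `pymol` on your system. Eigen is a C++ header library and is not checked here.

```python
import importlib.util
import shutil


def check_python_dependencies(modules=("mdtraj", "numpy")):
    """Return {module_name: bool} telling whether each module can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


def check_executable(name="pymol"):
    """Return the full path of an executable found on PATH, or None if absent."""
    # Assumption: the PyMOL launcher is named "pymol"; adjust if yours differs.
    return shutil.which(name)


if __name__ == "__main__":
    for module, found in check_python_dependencies().items():
        print(f"{module}: {'found' if found else 'MISSING'}")
    print(f"pymol executable: {check_executable() or 'not found on PATH'}")
```

A missing entry only means the tool is not visible from the current environment; it may still be installed elsewhere.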

Installation

1. Unzip COMMA.zip in the directory of your choice.
2. Add COMMA_HOME to your system environment variables; it should point to the COMMA directory.
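
To check that step 2 took effect, the variable can be inspected from Python. This is a small sketch under the assumption that COMMA_HOME is exposed as an ordinary environment variable; `resolve_comma_home` is a hypothetical helper, not something shipped with COMMA.

```python
import os
from pathlib import Path


def resolve_comma_home(env=None):
    """Return COMMA_HOME as a Path, or raise if it is unset or not a directory."""
    env = os.environ if env is None else env
    value = env.get("COMMA_HOME")
    if not value:
        raise RuntimeError("COMMA_HOME is not set")
    home = Path(value)
    if not home.is_dir():
        raise RuntimeError(f"COMMA_HOME does not point to a directory: {home}")
    return home
```

Calling `resolve_comma_home()` in a fresh terminal shows whether the variable persists across sessions or was only set in the current shell.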

Instructions for running COMMA

The instructions for running COMMA can be found in the README file.

The following input files for COMMA must all be in the same folder:

- the structure in PDB format (.pdb), with residue numbering starting from one.
- all the trajectories (PDB, XTC, TRR, DCD, binpos, NetCDF or MDTraj HDF5 format).
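
The input requirements above can be checked before launching a run. The following is a rough sketch rather than COMMA's own validation: the function names are hypothetical, the extension set is an assumption about how the listed formats are usually named on disk, and the residue number is read from the first ATOM record using the fixed PDB columns 23-26.

```python
from pathlib import Path

# Assumption: common file extensions for the trajectory formats listed above.
TRAJECTORY_EXTENSIONS = {
    ".pdb", ".xtc", ".trr", ".dcd", ".binpos",
    ".nc", ".netcdf", ".ncdf", ".h5", ".hdf5",
}


def first_residue_number(pdb_lines):
    """Return the residue number of the first ATOM record, or None if there is none."""
    for line in pdb_lines:
        if line.startswith("ATOM"):
            return int(line[22:26])  # PDB columns 23-26 hold the residue number
    return None


def check_inputs(data_dir, structure, trajectories):
    """Return a list of problems found; an empty list means the inputs look fine."""
    data_dir = Path(data_dir)
    problems = []

    pdb_path = data_dir / structure
    if pdb_path.suffix.lower() != ".pdb":
        problems.append(f"structure is not a .pdb file: {structure}")
    if not pdb_path.is_file():
        problems.append(f"structure not found in {data_dir}: {structure}")
    else:
        with open(pdb_path) as handle:
            first = first_residue_number(handle)
        if first != 1:
            problems.append(f"residue numbering starts at {first}, expected 1")

    for name in trajectories:
        path = data_dir / name
        if path.suffix.lower() not in TRAJECTORY_EXTENSIONS:
            problems.append(f"unrecognised trajectory extension: {name}")
        if not path.is_file():
            problems.append(f"trajectory not found in {data_dir}: {name}")

    return problems
```

Because every file is looked up relative to one `data_dir`, a missing-file message also catches inputs that were left in a different folder.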

An example of a command line to execute COMMA is:

$COMMA_HOME/COMMA "/home/yasaman/Desktop/Data/1BDD" "1BDD_20_50.pdb" "1BDD_20_50.binpos" "1BDD_20_50_0.binpos"

COMMA will then ask whether to perform pre-processing on the MD trajectories (yes or no). If the parameters have already been computed (e.g. INT matrix, CP matrix, ...), answer no.

After the execution, COMMA creates 2 folders in the specified data path:

1. Parameters, which holds all the parameters measured or used by COMMA.
2. Results, which contains all the generated outputs: adjacency matrices for the PCN and the pathway-based and clique-based connected components, all pairs of communicating segments, their direct contacts and a list of their strengths, and the average structure for each replicate.

Two pml scripts are generated in the data directory for the system:

1. StructureName_ComBlocks.pml: to visualize all connected components and cliques on the structure in PyMOL.
2. StructureName_SegPairs.pml: to visualize all pairs of communicating segments and their direct contacts in PyMOL.
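
The example invocation in this section can also be assembled from a script, which is convenient when running several systems in a row. The sketch below only mirrors the argument order of that example (data path, structure file, then trajectory files). The helper names are hypothetical, and it assumes the executable is `$COMMA_HOME/COMMA`, as in the example.

```python
import os
import subprocess
from pathlib import Path


def build_comma_command(comma_home, data_dir, structure, trajectories):
    """Build the argument list: executable, data path, structure, trajectories."""
    executable = Path(comma_home) / "COMMA"
    return [str(executable), str(data_dir), structure, *trajectories]


def run_comma(data_dir, structure, trajectories):
    """Run COMMA using the COMMA_HOME environment variable."""
    command = build_comma_command(
        os.environ["COMMA_HOME"], data_dir, structure, trajectories
    )
    # Standard input is inherited from the terminal, so the yes/no
    # pre-processing question can still be answered interactively.
    return subprocess.run(command, check=True)
```

For the example above, `run_comma("/home/yasaman/Desktop/Data/1BDD", "1BDD_20_50.pdb", ["1BDD_20_50.binpos", "1BDD_20_50_0.binpos"])` would issue the same command.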

Licence

The COMMA package has been developed under the CeCILL licence (see LICENCE).

Contact

For questions, comments or suggestions, feel free to contact Yasaman Karami, Elodie Laine or Alessandra Carbone.

Reference

If you use COMMA, please cite:

Y. Karami, E. Laine and A. Carbone, Dissecting protein architecture with communication blocks and communicating segment pairs, BMC Bioinformatics 17.2 (2016).