génétique des génomes bactériens [email protected] symplectic biology: universals in...

50
Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected] Symplectic Biology: universals in bacterial genomics 31 july 2006

Upload: merilyn-ferguson

Post on 28-Dec-2015

214 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Symplectic Biology:universals in bacterial genomics

31 july 2006

Page 2: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Symplectic biology: The Delphic Boat

Biology is a science of relationships between objects rather than of objects: from together, , to weaveProteins are part of complexes, as are parts in an engineAs for constructing a boat, failing to understand their relationships will result in ultimate failure of synthetic biology

The Delphic Boat: Harvard University

Press, février 2003

Page 3: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Authors

Génétique des Génomes Bactériens– Gang Fang– Evelyne Krin– Etienne Larsabal– Géraldine Pascal– Eduardo Rocha– Agnieszka Sekowska

Genoscope– Valérie Barbe– Stéphane Cruveiller– Sophie Mangenot– Géraldine Pascal– Zoé Rouy– David Vallenet– Claudine Médigue

Génétique in silico Marc Bailly-Béchet Massimo Vergassola

Abdus Salam International CenterIn Theoretical Physics

Mudassar Iqbal Matteo Marsili

Page 4: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Context

Physics: matter, energy, timeStatistical physics: Physics + informationBiology: Physics + information, coding, control...Arithmetics: sequence of integers, recursivity, coding…Computation: Arithmetics + program + machine...

A metaphor with practical consequences, the genetic program: we know how to manipulate the genes and their products

Page 5: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Three processes are needed for Life:

3. Information transfer (Living Computers?) => the goal of genomics is to decipher the program associated to the machine

Driving force for a coupling between the genome structure and the structure of the cell:1. Metabolism2. Compartmentalisation

The cell is the atom of life, with two strategies: a single envelope (prokaryotes) or multiplication of membrane and skins (eukaryotes); this is correlated with the genome sequence: at first sight prokaryotic genomes look random and eukaryotic genomes look repeated

What is Life?

Page 6: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Information transfer

Replication (law: “complementarity”; concept: “effective”)

Transcription (law: “complementarity”; concept: “constructive”)

Translation (law: a “cypher”, the “genetic code”; concept: “prospective” )

Myhill, J. (1952) Some philosophical implications of mathematical logic. Three classes of ideas. The Review of Metaphysics 6 : 165-198.

Process

Page 7: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

The action process

Replication, transcription, translation: high parallelism

“Beginning, Repeated Routine and Check Points, End”

The action is always oriented, with a beginning and an end

The control process of Check Points is rarely taken into account in present research (except in replication/division), but its role is essential to permit coordination of multiple actions in parallel

Machines…

Page 8: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Two processes are needed for computing:

A read/write machine

A program on a physical support (typically, a tape illustrates the sequential string of symbols that makes up the program), split (in practice) into two entities:

Program (providing the goal)Data (providing the context)

The machine is distinct from the program

What is computing?

Page 9: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Cells as computers

Genetic studies rest on the description of genomes as texts written with a four letters alphabet: do cells behave as computers?

Horizontal Gene Transfer

Virus

Genetic engineering => reconstruction of the hepatitis C virus

Animal cloning

all point to separation between

A « Machine » (cell factory)

and

Data + Programme

Turing

Page 10: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

If the machine has not only to behave as a computer but has also to construct the machine itself, one must find an image of the machine somewhere in the machine (John von Neumann)

Is there a map of the cell in the chromosome?

Page 11: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Drosophiloculus,

Homunculus ?

Transition

Page 12: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Genome organisation

Is the gene order random in the chromosomes?

At first sight, consistent with different DNA management processes not much is conserved, and genes transferred from other organisms are distributed throughout genomes

However, groups of genes such as operons or pathogenicity islands tend to cluster in specific places, and they code for proteins with common functions. « Persistent » genes are clustered together

Also, some motifs are ubiquitously present, suggesting general rules constraining genome organisation

E Larsabal, A DanchinGenomes are covered with ubiquitous 11bp periodic patterns, the "class A flexible patterns"BMC Bioinformatics (2005) 6: 206

Page 13: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Universal Motifs

Larsabal E, Danchin A.Genomes are covered with ubiquitous 11 bp periodic patterns, the "class A flexible patterns »BMC Bioinformatics. 2005 6:206.

Page 14: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

A universal rule: class A flexible patterns

The flexible nature of the patterns permits DNA to accomodate superturns or local bending

Page 15: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

A universal feature of the program: the period of 10-11.5

motifs

0 10 20 30 40 50 60 70 80 90 100bp

0 10 20 30 40 50 60 70 80 90 100bp

ffss(G(G--))-0.01

0

0.01

0 10 20 30 40 50 60 70 80 90 100bp

real

model

difference

Helicobacter pylori

Page 16: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Flexible motifs of type A

methods

1-x1-xAAxxxxxxxxTTxxxxxxxxAAxxxxxxxxTTTTxxxxxxxxxxAAxxxxxxxxTTxxxxxxxxAAxxxxxx: : All kindomsAll kindoms 2-xxxxxxxxxxx2-xxxxxxxxxxxGGxxxxxxxxTTTTxxxxxxCCxxxxxxxxxxTTxxxxxxxxxxxxxxxxxx: : ProteobacteriaProteobacteria 4-xxxxxx4-xxxxxxTTxxxxxxxxAGAGxxxxxxTTTTxxxxxxxxxxxxxxxxTTxxxxxxxxxxxxxxxxxxxx: : ArchaeaArchaea 55''--xxx-10xxxxxxxxx0xxxxxxxx10xxxxxxxxx-10xxxxxxxxx0xxxxxxxx10xxxxxxbp-3bp-3''

TTTTxxxxxxGGxxxxxxTTxxxxxxxxxxxxxxxxxxxxTTTT

The nucleotides composing this class A flexible pattern are accessible through this side too but the dinucleotides are set in minor grooves.

The nucleotides composing this class A flexible pattern are fully accessible through this side and the dinucleotides are set in major grooves.

TTTT

GG TT TTTT

AAAACC

AA AAAA

Page 17: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

From the leading strand to the lagging

strand

Page 18: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

180

90

0

27055% leading

Escherichia coli

Ori

Ter

9027065% leading

Treponema pallidum

Ori

Ter

180

9027075% leading

Bacillus subtilis

Ori

Ter

9027087% leading

Thermoanaerobacter tengcongensis

Ori

Ter

CDSs density

Leading CDSs density

Page 19: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

To lead or to lag...

Is it possible to see whether the position of genes in the chromosome is randomly distributed on the leading and lagging strand?

Page 20: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Chosing arbitrarily an origin of replication and a property of the strand (base composition, codon composition, codon usage, amino acid composition of the coded protein…) one can use discriminant analysis to see whether the hypothesis holds.

To lag or to lead...

E. Rocha, A. Danchin & A. Viari Universal replication biases in bacteria. Mol. Microbiol. (1999) 32: 11-16

Page 21: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

That is the question…

.

0,45

0,5

0,55

0,6

0,65

0,7

0,75

0,80,85

0 20 40 60 80 100

Bacillussubtilis

acc

ura

cy

Borreliaburgdorferi

0,4

0,5

0,6

0,7

0,8

0,9

1

0 20 40 60 80 1000,4

0,5

0,6

0,7

0,8

0,9

1

Chlamydiatrachomatis

0 20 40 60 80 100

0,45

0,5

0,55

0,6

0,65

0,7

0,75

0 20 40 60 80 100

Escherichiacoli

acc

ura

cy

0,45

0,5

0,55

0,6

0,65

0,7

0,75

0 20 40 60 80 100

Heamophilusinfluenzae

0 20 40 60 80 100

HelicobacterPylori

0,4

0,45

0,5

0,55

0,6

0,65

0,7

0,4

0,45

0,5

0,55

0,6

0,65

0,7

0,750,8

0 20 40 60 80 100

Méthanobacteriumthermoautotrophicum

position (%) position (%) position (%)

acc

ura

cy

0,45

0,5

0,55

0,6

0,65

0,7

0,75

0 20 40 60 80 100

Mycobacteriumtuberculosis

0,4

0,5

0,6

0,7

0,8

0,9

1

0 20 40 60 80 100

Treponemapallidum

Bases

Amino acids

Codons

Dinucleotides

Page 22: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Visible in proteins…

.

0 5 10 150

5

10

15B. burgdorferi

% Valine

% T h r e o n i n e

I

0

5

10

15

20

25B. burgdorferi

0 5 10 15

% I s o l e u c i n e% Valine

GT on the leading strand, CA on the lagging strand...

Page 23: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Essential genes in Bacillus subtilis: function

enters

highlyexpressed

0%

25%

50%

75%

100%

non-highlyexpressed

essential genes non-essential genes

lagging

leading

non-highlyexpressed

highlyexpressed

Rocha EP, Danchin A. Essentiality, not expressiveness, drives gene-strand bias in bacteriaNature Genetics. 2003 34:377-378.

Page 24: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

When polymerases collide

DNAPdeceleration

End oftranscription

Arrest of RNAP & DNAP

Transcriptionabortion

Co-oriented

Head-on

Consequences:1. Replication slow-down

2. Loss of transcripts

Consequences:1. Aborted transcripts

2. Truncated essential proteins

Page 25: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

From function to structure, not vice

versa

Page 26: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

In 1991, at the EU meeting on genome programs in Elounda, Greece, the presentation of the yeast chromosome III and the first 100 kb of the Bacillus subtilis genome revealed that, contrary to expectation (the only cases where this had been observed were phages, for obvious reasons), at least half of the genes uncovered were totally unknown, whether in structure or in function

Among reasons for that is our present lack of deep knowledge of metabolism, as well as our lack of knowledge of the way new genes are created, selecting function first, then recruiting a structure that will be improved as it is submitted to natural selection for increased fitness of its host (acquisitive evolution)

The first discovery of genomics

Page 27: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

2045 ongoing projects, 348 completed, mostly from microbes (228 with more than 1500 genes, more or less correctly annotated)

144,116,054,623 nucleotides at International Nucleotide Sequence Database Collaboration (INSDC)

Microbes make 50% of the Earth protoplasm40-50% coding DNA sequences (CDSs) do not

correspond to known functions; 10% correspond to the core genome ( « persistent » genes)

Genome projects

Page 28: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

The darwinian trio

Variation / Selection / Amplification Stabilisation

Evolution creates

Function captures (recruits)

Structure code

Sequence

Page 29: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

What functions for life?Extending Cuvier’s vision

Root function and helper functions [the root function of a printer is “printing”, “feeding paper”, “supplying energy” are helper functions]

To be — to persist in time — can be proposed as the root function of living organisms

Self-consistence implies correlation of forms

Fighting weathering implies chemical turnover (metabolism) and protection (compartmentalisation)

Exploration, associated to sensing and memorizing is the discovery that made life as we know it

causeries/causeries.html

Page 30: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

exploration

sensing

representation (memory)

Compartmentalisation Metabolism Information transfer

energisation replication

shaping

making

an

envelope

making

a

skeleton

making

appendages

construction

of biomass

precursors

transcription

phospholipid and envelope biosynthesis translation

transport degradation editing

circulation (chanelling) salvage folding/scaffolding

protection cleaning control

partitioning inactivation

storage maintenance (repair, degradation)

modification (labelling, maturation,

addressing, stabilisation, protection,

control)

What functions for life?

Page 31: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

From sequence to function?

Inductive reasoning combines sequence data and in silico predictions, that are tested using expression profiling (transcriptome and proteome) as well as many other « neighborhoods », such as amino acid composition, isolectric point in proteins or codon usage biases in genes

One notes that regulation evolves much faster than any other process: in the long run the structural genes are the most important ones

Page 32: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Three examples of the role of the context

Microbial genes are of infinite diversity but there exists universals; only about 10% of their genes are of persistent and recognized function; we do not have yet a fair idea of the number of microbial species; the number of genes in a given species is highly variable (horizontal gene transfer)

Example 1: persistent genes Example 2: orphan genes and universal amino acids [Example 3: a new metabolic pathway] ....

Page 33: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Gene Persistence

Page 34: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Some of the essential genes missing from the list of persistent genes have diverged considerably. To assess the contribution of this effect we measured for each pair of genomes the correlation between the similarity of orthologous pairs and that of the 16S rRNA. The correlations were high. For example (A), 38% (resp. 48%) of B. subtilis (resp. E. coli) persistent genes showed a correlation coefficient >0.9 between the sequence similarity of the pair of orthologs and the 16S RNA.

In contrast, some genes (B) evolve in an erratic way. This may be due to horizontal gene transfer, local adaptations leading to faster or slower evolutionary pace, or simply wrong assignments of orthology. The latter can be a significant problem, especially in large protein families. The genes presenting such an erratic pattern are seldom found in the persistent set.

Gene persistence

G Fang, EPC Rocha, A DanchinHow essential are non-essential genes?Mol Biol Evol (2005) 22: 2147-2156

Page 35: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Genomic islands

A clustering method based on the analysis of codon usage biases using an information theory groups the genes into homogeneous clusters, which are not distributed randomly in the chromosome. The method allows finding both the specific codon usage bias in a class and the most relevant number of classes (4 for E. coli and 5 for B. subtilis). One cluster is related to expression levels. Other groups feature an over-representation of genes belonging to different functional groups: horizontally transferred genes, motility and intermediary metabolism.

M. Bailly-Béchet

M Bailly-Bechet, A Danchin, M Iqbal, M Marsili, M VergassolaCodon usage domains over bacterial chromosomesPLoS Computational Biology (2006) 2: april 20th

Page 36: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

What functions for life?Scenario for the origin of life

To be — to persist in time — can be proposed as the root function of living organisms

Fighting weathering implies chemical turnover (metabolism) on solid surfaces and immobility requires protection (compartmentalisation)

Compartementalised metabolism creates surface substitutes (RNA)

Exploration, associated to sensing and memorizing (information transfer) is the discovery that made life as we know it

causeries/causeries.html

A DanchinHomeotopic transformation and the origin of translation Progress in Biophysics and Molecular Biology (1989) 54: 81-86

Page 37: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Persistent genes organisation recapitulates the origin of life

Using 228 genomes, we have identified genes that tend to remain close to one another; this « mutual attraction » constructs a remarkable network made of three concentric circles

The external network, made from genes of intermediary metabolism, is highly fragmented; the middle network has tARN synthetases at its core, et le internal network, almost continuous makes the core of information transfer around the ribosome, transcription and replication G. Fang

Page 38: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Thank you

Page 39: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Coping with cold and aging:Lessons from Pseudoalteromonas

haloplanktis

1 august 2006

Page 40: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Authors

Génétique des Génomes BactériensGang FangEvelyne KrinEtienne LarsabalGéraldine PascalEduardo RochaAgnieszka Sekowska

GenoscopeValérie BarbeStéphane CruveillerSophie MangenotGéraldine PascalZoé RouyDavid VallenetClaudine Médigue

University of Hong Kong Christine Ho Frankie Cheung

University of Liège Georges FellerUniversity of Naples Luisa TutinoUniversity of Strasbourg Philippe BertinUniversity of Stockholm Gunnar von Heijne

Page 41: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Ecological neighbourhood:

growth in the cold

Page 42: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Challenges posed by cold

Protein folding

RNA folding

Membrane fluidity

Sensitivity towards oxygen and radicals

Distribution of aminoacids in proteins...

=> A series of unexpected answers

folding

Page 43: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Genome Organisation

P. haloplanktis

origins

Page 44: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Neighbourhood: distribution of aminoacids in the proteome

G. Pascal

Bias in amino acid distribution

Page 45: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Universal biases in protein amino acid composition

First axis: separates Integral Inner Membrane Proteins (IIMP) from the rest; driven by opposition between charged and large hydrophobic residues

Second axis: separates proteins according to an opposition driven by the G+C content of the first codon base

Third axis: separates proteins by their content in aromatic amino acids; enriched in orphan proteins

Page 46: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Temperature-dependent biases in protein amino

acid composition

• The general trend of amino acid composition bias is to avoid some aminoacids at higher temperatures (associated to aging processes)• Mesophilic bacteria belong to at least two different classes (in a 5-clusters analysis)• Biases are always dominated by the IIMP clustering

C Médigue, E Krin, G Pascal, V Barbe, A Bernsel, PN Bertin, F Cheung, S Cruveiller, S D'Amico, A Duilio, G Fang, G Feller, C Ho, S Mangenot, G Marino, J Nilsson, E Parrilli, EPC Rocha, Z Rouy, A Sekowska, ML Tutino, D Vallenet, G von Heijne, A DanchinCoping with cold: the genome of the versatile marine Antarctica bacterium Pseudoalteromonas haloplanktis TAC125Genome Research (2005) 15: 1325-1335

Page 47: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

An asparagine bias is specific to psychrophiles

Biases correlated with temperature

IIMPsIIMPs

53% 53% mesophilesmesophiles

62% 62% thermophilesthermophiles

55% psychrophiles

55% psychrophiles

Motility

Envelope, outer membrane

Transport (TonB), secretion

Adaptation to stress

DNA and RAN metabolism

isoaspartate

G. Pascal

Page 48: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Orphans: the gluons

A remarkable role of aromatic amino acids creates a universal bias. Expressed orphan proteins are enriched in these residues, suggesting that they might participate in a process of gain of function during evolution. We postulate that the majority is made of proteins — gluons — involved in stabilising complexes, thus defining the "self" of the species.

G Pascal, C Médigue, A DanchinUniversal biases in protein composition of model prokaryotesProteins (2005) 60: 27-35

Page 49: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Why aromatic residues in orphan proteins?

From orphans « Gluons »

Orphan loose their status in the course of evolution: Rocha. 2002. Pedulla. 2003

G. Pascal

Page 50: Génétique des Génomes Bactériens  adanchin@pasteur.fr Symplectic Biology: universals in bacterial genomics 31

Génétique des Génomes Bactériens http://www.pasteur.fr/recherche/unites/REG/ [email protected]

Thank you