Measuring proximate protein associations in vivo using protein-fragment complementation

Master's thesis (Mémoire)

Andrée-Ève Chrétien

Master's in Biology (Maîtrise en biologie)

Maître ès sciences (M. Sc.)

Québec, Canada

© Andrée-Ève Chrétien, 2017

Under the supervision of:

Christian Landry, research supervisor

Summary

Protein-protein interactions (PPIs) underlie cellular function in all organisms. Methods for studying PPIs fall into two categories: they either identify the proteins that make up a complex or determine the relationships between proteins. Few hybrid methods provide both kinds of information, and those that exist have several limitations. The goal of this project was to develop a new hybrid method by modifying the protein-fragment complementation assay (DHFR PCA) in the yeast Saccharomyces cerevisiae. DHFR PCA relies on the association of two complementary reporter fragments in the presence of a protein-protein interaction. The reporter fragments are fused to the proteins through a peptide linker. The length of the linker limits the maximum distance at which an interaction between two proteins can be detected. Our hypothesis was that increasing the linker length would allow us to detect more distant interactions. We first verified that increasing the linker length changed our ability to detect interactions without losing the specificity of the method. New interactions were detected within a single protein complex and between two complexes. We then validated our ability to better dissect the architecture of protein complexes by examining five protein complexes in depth with several combinations of linker lengths. Finally, we confirmed that the method does detect interactions between more distant proteins by comparing our results with distances calculated from available proteasome structures. This variation of DHFR PCA makes it possible to modulate the resolution at which PPIs are studied and thus to better define the architecture of protein complexes.

Abstract

Protein-protein interactions (PPIs) are central to all cellular processes in all organisms. Methods for studying PPIs fall into two categories: they either identify the proteins that compose protein complexes or determine the relationships between proteins. Only a few hybrid methods provide both kinds of information, and these methods have many limitations. The goal of this project was to develop a new hybrid method by modifying the protein-fragment complementation assay (DHFR PCA) in the yeast Saccharomyces cerevisiae. DHFR PCA is based on the association of two complementary reporter fragments in the presence of an interaction. Both fragments are fused to proteins through a peptide linker. Linker length limits the maximal distance at which an interaction between two proteins can be detected. Our hypothesis was that increasing linker length would allow the detection of more distant interactions. We first verified that increasing linker length modified our capacity to detect interactions without losing specificity. New interactions were detected within and between complexes. We then validated our capacity to better dissect the architecture of protein complexes by studying five protein complexes with different combinations of linker lengths. Finally, we confirmed that the method allowed the detection of interactions between proteins that are further apart in space by comparing our results with distances calculated from available proteasome structures. This variation of DHFR PCA makes it possible to modulate the resolution of PPI studies and thus to better define the architecture of protein complexes.

Table of contents

Summary
Abstract
Table of contents
List of tables
List of figures
List of abbreviations
Acknowledgements
Foreword
General introduction
1.1 The fundamental nature of protein-protein interactions
1.2 Practical applications of the study of protein-protein interactions
1.3 Categories of methods for studying protein-protein interactions
1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry
1.3.2 Methods determining the protein interaction network
1.4 Current challenge in the study of protein-protein interactions
1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions
1.6 Research objectives
Measuring proximate protein association in living cells using Protein-fragment complementation assay (PCA)
Summary
Abstract
Introduction
Material and Methods
Yeast
Bacteria
Plasmid construction
Strain construction
Estimation of protein abundance
Protein-fragment complementation assays
PCA images and statistical analyses
Analysis of protein distances within complexes
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
PCA signal reflects the super-organization of protein complexes
Longer linkers allow detection of more distant proteins in complexes
Conclusion
Acknowledgements
General conclusion
Bibliography

List of tables

Table S1A. Description of the strains constructed and used for this study
Table S1B. PCA data for global PCA experiment
Table S1C. PCA data for intra-complexes experiment
Table S1D. PCR primers used in this study
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2B. Identity between each RNApol structure and the experimental sequences
Table S2C. Identity between proteasome structure and the experimental sequence
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

List of figures

Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale Protein-fragment complementation (PCA) screen and prove useful to infer the super-organization of protein complexes
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes
Figure S1. Data related to the PCA experiments
Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

List of abbreviations

%          Percent
°C         Degree Celsius
Å          Ångström
DNA        Deoxyribonucleic acid
Amp        Ampicillin
mRNA       Messenger ribonucleic acid
BioID      Proximity-dependent biotinylation
ClonNAT    Nourseothricin
COG        Conserved oligomeric Golgi
DHFR       Dihydrofolate reductase
DMSO       Dimethyl sulfoxide
F[1,2]     Fragment 1,2 of DHFR
F[3]       Fragment 3 of DHFR
FDR        Corrected P-value
FRET       Energy transfer between fluorescent molecules
g          Gram
Gly or G   Glycine
h          Hour
HygB       Hygromycin B
Is         Interaction score
L          Litre
Log        Logarithm
M          Molar
Min        Minute
mL         Millilitre
mM         Millimolar
MS         Mass spectrometry
MS/MS      Tandem mass spectrometry
MTX        Methotrexate
MYTH       Membrane yeast two-hybrid
NaCl       Sodium chloride
NMR        Nuclear magnetic resonance
OD         Optical density
PBS        Phosphate-buffered saline
PCA        Protein-fragment complementation assay
PCR        Polymerase chain reaction
PKA        Protein kinase A
PPI        Protein-protein interaction
Q1         Quartile 1
Q3         Quartile 3
r          Correlation coefficient
RNApol     RNA polymerase
Sdb        Standard deviation
Ser or S   Serine
SDS        Sodium dodecyl sulfate
SDS-PAGE   Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test     Student's t-test
YPD        Yeast extract peptone dextrose
Y2H        Yeast two-hybrid
Zs         Z-score
µb         Estimated mean
µg         Microgram
µL         Microlitre
µM         Micromolar
2YT        2x yeast extract tryptone
2xL        Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL        Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL        Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif

Acknowledgements

Completing this project required the help of many people, whom I sincerely wish to thank. First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey, Christian encouraged me to give the best of myself, both scientifically and as a member of the group. He not only gave me the material means to do so, but also showed me that I had the ability to do it. Christian is a very present supervisor who is available to his students. He offered me opportunities and supported me in each of them.

I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr. Nicolas Bisson, for their advice and the time they devoted to this project.

I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the two research professionals in the laboratory. Their great expertise and their passion for science are a pillar of this team. Without their valuable advice, dedication and availability, carrying out this project would have been particularly arduous. I also wish to thank my collaborators, Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is all the better. Special thanks to Xavier for his help, his availability and the lively discussions.

I believe it is important to thank all the members of the Landry laboratory. Graduate studies require spending a great deal of time in the laboratory, which becomes like a second home. Hence the importance of sharing laughter and building camaraderie with its members. I would like to thank them all for the chats and laughs at the famous "tea breaks", the lively discussions and, of course, the support, both in the laboratory and moral. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. A very special thank you to Guillaume and Hélène, who were especially good at putting a smile on my face and at supporting and advising me in difficult times.

It is also important to thank my parents, as well as all my family and friends. My parents have always encouraged me to fulfil myself and to love my work. They not only provided an ideal setting for reaching my goals throughout my studies, but also gave me their moral support and instilled in me the importance of always doing my best. The values they passed on to me have given me a strong sense of responsibility, honesty and commitment. Thanks to my family and friends, I was able to unwind, simply have fun and pour my heart out from time to time. They were a source of moral support.

Finally, I wish to thank from the bottom of my heart my partner, Marc Bélanger. Marc is an incredibly generous person: generous with his time, his attention, his knowledge and his passions. He was an invaluable support throughout this journey, at every moment. His encouragement, his shoulder, his tissues and his understanding soothed my fears and my sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has given me personally, humanly and professionally. Marc has made me a better person, and I will always be grateful to him. Thank you, my love, thank you for everything.

Foreword

This thesis consists of a single chapter written in the form of a scientific article that will be submitted for publication. The article presents the adaptation of the PCA method to detect associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to planning the experiments with Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research professionals). Several people, myself included, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were done jointly by Isabelle Gagnon-Arsenault, Christian Landry and myself.

During this project, I also contributed to writing a literature review published in Briefings in Functional Genomics in March 2016 under the title "Multi-scale perturbations of protein interactomes reveals their mechanisms of regulation, robustness and insights into genotype-phenotype maps". Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry. That article is not presented in this thesis.

General introduction

1.1 The fundamental nature of protein-protein interactions

Proteins, through their great diversity of roles, are regarded as the machinery of life. Their temporary or permanent associations are at the heart of signalling and regulatory pathways as well as protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since they are involved in all cellular processes and in the maintenance of cellular functions.

Interactions that form transiently are often found in signalling and regulatory processes. They require excellent spatiotemporal coordination, which explains why poor coordination can lead to diseases such as cancer (1). One example of a transient association is that of the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, ageing and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).

In addition to transient interactions, the cell hosts stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs of a complex are stable, the complex may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins gives rise to additional biological activities that would be impossible for the individual proteins. A good illustration of this concept is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).

The two preceding examples clearly show the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning and cellular viability of a given organism. The PPI network has been mapped on a large scale for several organisms, including humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses (27-29). These maps provide a static picture of the network and do not fully account for the cell's ability to adapt to different conditions (e.g., environment, cell cycle). To overcome this limitation, additional maps were subsequently produced that take the dynamics of interaction networks into account by perturbing cell growth conditions. Among other things, they provide information on the adaptation or plasticity of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.

1.2 Practical applications of the study of protein-protein interactions

The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore in yeast and trypanosomes (30). These two organisms, which diverged more than 1.5 billion years ago, show similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the nuclear membrane and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mRNA export mechanisms in the two organisms (30). In addition, perturbing PPIs makes it possible to assess the robustness of a protein complex to mutations, that is, the ability of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analysing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of certain proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).

In medicine, the study of PPIs has been widely used to discover new drugs (32-34). In addition, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which will eventually make it possible to treat infections caused by these parasites (35). PPIs also help in understanding the genetic basis of diseases, as shown by Sahni and colleagues. This team examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of interaction networks was responsible for the diseases under study, affecting the networks either partially or completely. Moreover, different mutations in the same gene cause different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, since each has advantages and limitations that allow it to target only a particular subset of the interaction network (37). Even so, the methods as a whole can be divided into two main categories: methods that determine the composition of protein complexes and methods that determine physical interactions between two proteins.

The first category includes methods that purify a protein complex, by either affinity or separation chromatography, and then analyse it by mass spectrometry (MS). The second category comprises a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the second category share a very similar principle, based on the reconstitution of a functional reporter that emits a signal when the two proteins physically interact. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, "hybrid method" refers to methods that detect associations between proteins that are close together in space without necessarily being in physical contact. These methods therefore have characteristics of both categories. For the purposes of this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary, because they define, on the one hand, the components of a protein complex and, on the other, the relationships among them.
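To make this complementarity concrete, here is a minimal Python sketch, not part of the original work, that combines the two kinds of output. It assumes that a first-category method yields the set of members of a complex and that a second-category method yields a list of protein pairs that gave a signal. The subunit names (A to D, X), the example hits and the helper `intra_complex_map` are hypothetical placeholders.

```python
from itertools import combinations

# Hypothetical data: subunit names are placeholders, not results from this work.
complex_members = {"A", "B", "C", "D"}                 # category 1: who is in the complex
pairwise_hits = {("B", "A"), ("B", "C"), ("C", "X")}   # category 2: pairs that gave a signal


def normalize(pair):
    """Return the pair in alphabetical order so (A, B) and (B, A) compare equal."""
    return tuple(sorted(pair))


def intra_complex_map(members, hits):
    """Split all possible pairs of complex members into detected and undetected pairs."""
    hits = {normalize(p) for p in hits}
    all_pairs = {normalize(p) for p in combinations(sorted(members), 2)}
    detected = all_pairs & hits      # pairs inside the complex with a signal
    undetected = all_pairs - hits    # pairs inside the complex without a signal
    return detected, undetected


detected, undetected = intra_complex_map(complex_members, pairwise_hits)
print("detected within the complex:", sorted(detected))
print("not detected within the complex:", sorted(undetected))
```

In this toy case, the pair involving X is ignored because X is not a member of the complex; the membership list alone says nothing about which members are neighbours, and the pair list alone says nothing about where the complex ends.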

1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purifying protein complexes and identifying their components by MS is an approach that aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract using an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the non-specific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database makes it possible to identify the proteins in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. Following a first MS, a peptide is selected and fragmented, and a new spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this purification is that it can isolate all the protein complexes of an organism for study (43).
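The separation of peptides by mass-to-charge ratio can be illustrated with a short calculation. The Python sketch below is an illustration under stated assumptions, not the procedure of any particular instrument or search software: it uses standard monoisotopic residue masses for a small subset of amino acids, assumes that each charge comes from one added proton, and uses a made-up peptide sequence (`GASPVLK`).

```python
# Monoisotopic residue masses in daltons (standard values; small subset for illustration).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "R": 156.10111,
}
WATER = 18.01056    # one water is added for the free N- and C-termini
PROTON = 1.007276   # mass of the proton that carries each positive charge


def peptide_mass(sequence):
    """Neutral monoisotopic mass of a linear, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mass_to_charge(sequence, charge):
    """m/z of the peptide carrying `charge` protons (assumes protonation only)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


peptide = "GASPVLK"  # hypothetical peptide, not taken from this work
for z in (1, 2, 3):
    print(f"{peptide} at charge {z}+: m/z = {mass_to_charge(peptide, z):.4f}")
```

The same peptide appears at a different position in the spectrum for each charge state, which is one reason why observed profiles are matched against a database rather than read directly as masses.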

1.3.2 Methods determining the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and the protein-fragment complementation assay

Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments attached to the two proteins of interest through a linker. When the two proteins of interest physically interact, the two reporter fragments assemble, reconstituting a functional reporter that produces a detectable signal. In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that enabled the development of several others. However, the technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variations have nevertheless been made to the method to allow the study of membrane proteins (45-47); one such method is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, produces background noise and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as humans, to be studied in a simpler model (51).

In MYTH, the two reporter fragments are a mutated ubiquitin to which a transcription factor is attached. In the presence of a physical interaction between the proteins of interest, the transcription factor attached to the reconstituted ubiquitin is released, activating transcription of a reporter gene. Methods based on split-ubiquitin have enabled major advances in the study of membrane proteins, insoluble proteins and proteins located outside the nucleus. However, MYTH shares certain drawbacks with Y2H, such as substantial background noise and the inability to quantify results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of using a transcription factor as a reporter, it uses a protein that has been split into two fragments. The choice of the reporter and of the split site were key elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to associate spontaneously unless they are very close to each other, which lowers background noise (54). In yeast, PCA uses as a reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that makes the cell resistant to methotrexate (MTX). This enzyme is essential for cell growth and is involved in particular in the synthesis of some DNA bases (the purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. The technique has the advantage of being quantitative while keeping the native promoter of the proteins under study (48, 55, 56). In addition, results obtained by PCA suggest that the cellular localization of the proteins is preserved: there is a gene ontology enrichment for several known proteins that share the same cellular localization (55). However, a change in localization cannot be ruled out, because the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

One major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. Adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. The composition or length of the linker may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are mostly useful when a flexible linker fails to separate the two elements properly or interferes with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released under certain conditions. They are particularly useful when each element must carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although they are classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at a lower resolution.

FRET relies on energy transfer between two fluorescent proteins that are close to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Exciting the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are close to each other. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. The method is particularly useful for following an interaction over time. However, substantial background noise and the partial overlap between the fluorescence of the two proteins can hamper the interpretation of the results (60-63).

Cross-linking followed by MS is almost identical to purification and MS techniques, except that the proteins are linked to each other by covalent bonds before purification. These bonds withstand enzymatic digestion and thus provide structural information on how the proteins associate within the protein complex. Nevertheless, cross-linking makes data analysis more complex and can lead to an incorrect picture of the architecture of the protein complex. The method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase that lacks specificity and is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions as well as interactions between neighboring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can interfere with the activity of the protein of interest or with the formation of interactions. In addition, the method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They provide information on protein proximity and give access to a new scale of molecular resolution that is otherwise hard to reach. In addition to being complex, existing techniques require dedicated infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would help better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. They would also make up for the shortcomings of high-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. These high-resolution methods are difficult to apply to many protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and of the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It provides good flexibility and allows the reporter fragments to associate properly in the cellular environment. Glycine and serine are both small amino acids, the former nonpolar and the latter polar and uncharged. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also constrains the ability to detect an interaction, as observed in particular by the research team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 35 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to obtain new structural information.
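The relationship between linker length and detectable distance can be illustrated with a back-of-the-envelope calculation. The sketch below takes the 82 Å figure as the span of two 10-residue (GGGGS)2 linkers placed end to end, which gives about 4.1 Å per residue. Extrapolating linearly to longer linkers is an assumption made here for illustration only (flexible linkers are rarely fully extended in the cell), and the function name `max_span` is hypothetical.

```python
# Rough "ruler" estimate of the maximum span covered by two linkers end to end.
# Assumption (not from the source): span scales linearly with residue count.

RESIDUES_PER_REPEAT = 5          # one Gly-Gly-Gly-Gly-Ser motif
REFERENCE_SPAN_ANGSTROM = 82.0   # two 2-repeat linkers end to end, from the text
REFERENCE_RESIDUES = 2 * 2 * RESIDUES_PER_REPEAT  # 20 residues in total

ANGSTROM_PER_RESIDUE = REFERENCE_SPAN_ANGSTROM / REFERENCE_RESIDUES


def max_span(repeats_a: int, repeats_b: int) -> float:
    """Estimated maximum span (in angstroms) for a pair of linkers."""
    residues = (repeats_a + repeats_b) * RESIDUES_PER_REPEAT
    return residues * ANGSTROM_PER_RESIDUE


if __name__ == "__main__":
    # Hypothetical linker pairs, given as numbers of GGGGS repeats.
    for a, b in [(2, 2), (3, 2), (4, 2), (4, 4)]:
        print(f"{a} + {b} repeats: ~{max_span(a, b):.1f} angstroms")
```

Under this linear assumption, doubling both linkers would double the reachable distance; in practice the effective reach is expected to be shorter and would need to be calibrated against known structures.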

1.6 Research objectives

The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length in DHFR PCA would make it possible to detect interactions between proteins that are increasingly far apart in space, thereby modulating the scale of molecular resolution. This adaptation would yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes that differ in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The last objective was to determine whether increasing linker length made it possible to detect associations between proteins that are farther apart in space. To do so, distances between the proteins contained in the proteasome structures were calculated and compared with the experimental results.

This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive for several reasons, including the availability of many powerful genetic tools, its rapid cell division and the abundance of data on the structure of protein complexes and on PPIs. In addition, this organism has played a major role in advancing knowledge in various fields such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangement. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast Protein-fragment Complementation Assay based on the dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary because the first tells us which proteins assemble into complexes in the cell, and the second tells us how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future development in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 35 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
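The idea of synonymous codons can be checked with a short script. The sketch below translates two different DNA sequences and confirms that both encode the same Gly-Gly-Gly-Gly-Ser repeat. Both sequences are hypothetical examples written for illustration; they are not the sequences used in the plasmids, which are not given in the text.

```python
# Check that two different DNA sequences encode the same GGGGS linker repeat.
# Both sequences below are hypothetical, not the ones used in the plasmids.

CODON_TABLE = {
    # Glycine codons
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    # Serine codons
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
}


def translate(dna: str) -> str:
    """Translate a DNA sequence made only of Gly/Ser codons into a peptide."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


repeat_a = "GGTGGTGGTGGTTCT"  # hypothetical repeat
repeat_b = "GGCGGAGGGGGCAGC"  # hypothetical synonymous version of the same repeat

print(translate(repeat_a), translate(repeat_b), repeat_a != repeat_b)
```

One plausible motivation for varying the codons, not stated in the text, is to avoid long stretches of identical DNA within the linker, which can complicate synthesis, PCR and cloning.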

To replace the 2xL of pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 with flanking BamHI and XbaI restriction sites. The 3xL/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and confirmed by Sanger sequencing.

The pAG25-3xL/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3xL/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for pAG25-3xL/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3' end of genes. 2xL/3xL/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (for resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains with proteins fused to the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies raised against them. 20 OD600 units of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). 425-600 µm glass beads (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 units of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2xL/3xL/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them, based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published in (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins initially selected are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
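The block design described above amounts to enumerating every bait-prey pairing of the two linker sizes. A minimal sketch is given below; the protein names and the label format are placeholders chosen for illustration, and `linker_block` is a hypothetical helper, not part of the authors' pipeline.

```python
from itertools import product

LINKERS = ("2xL", "4xL")  # linker sizes used in the intra-complexes experiment


def linker_block(bait: str, prey: str) -> list[str]:
    """Return the four bait-prey linker combinations tested for one protein pair."""
    return [f"{bait}-{b}/{prey}-{p}" for b, p in product(LINKERS, repeat=2)]


# "ProteinA" and "ProteinB" are placeholder names.
for label in linker_block("ProteinA", "ProteinB"):
    print(label)
```

With 14 or 16 replicate blocks per pair, each protein pair therefore contributes 56 or 64 colonies to the array.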

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Next, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the difference in signal between linker sizes was positive, null or negative. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were performed and 6 µL of each dilution was spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).
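The cell densities spotted in such an assay follow directly from the dilution factor. The sketch below computes the OD600/mL of each spot for a 5-fold series starting at 1. The number of dilution steps is not given in the text, so the value of four used here is an assumption, and `serial_dilution` is a hypothetical helper.

```python
def serial_dilution(start_od: float, fold: float, steps: int) -> list[float]:
    """OD600/mL of the starting culture followed by each successive dilution."""
    return [start_od / fold ** i for i in range(steps + 1)]


# Four dilution steps is an assumed value; the text does not state how many were made.
for i, od in enumerate(serial_dilution(1.0, 5, 4)):
    print(f"dilution {i}: OD600/mL = {od:g}")
```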

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted, and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and applied a manual threshold across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function "Analyze Particles" in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels or a circularity below 0.5, and used the particle closest to the center. We considered the particle to be a colony if its mass center was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
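The particle filter described above can be sketched in a few lines. The example below is a Python re-expression written for illustration and is not the authors' ImageJ script. It assumes the thresholds read as an area of 20 pixels and a circularity of 0.5, uses the standard ImageJ definition of circularity (4π × area / perimeter²), and the `Particle` class and its fields are hypothetical.

```python
import math
from dataclasses import dataclass

MIN_AREA = 20          # pixels, threshold read from the text
MIN_CIRCULARITY = 0.5  # threshold read from the text


@dataclass
class Particle:
    area: float         # in pixels
    perimeter: float    # in pixels
    touches_edge: bool  # True if the particle touches the edge of the selection


def circularity(p: Particle) -> float:
    """ImageJ-style circularity: 4*pi*area / perimeter**2 (1.0 for a perfect circle)."""
    return 4 * math.pi * p.area / p.perimeter ** 2


def is_colony_candidate(p: Particle) -> bool:
    """Keep particles that are inside the selection, large enough and round enough."""
    return (not p.touches_edge
            and p.area >= MIN_AREA
            and circularity(p) >= MIN_CIRCULARITY)


# A round particle of radius 5 pixels passes; an elongated streak does not.
round_colony = Particle(area=math.pi * 25, perimeter=2 * math.pi * 5, touches_edge=False)
streak = Particle(area=30, perimeter=60, touches_edge=False)
print(is_colony_candidate(round_colony), is_colony_candidate(streak))
```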

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.

For the global PCA experiment, interactions with at least two replicates for all linker combinations were conserved and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered as significantly detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
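The scoring steps above (log2 transformation, median over replicates, conversion to a z-score and thresholding) can be illustrated with a minimal Python sketch. It assumes that the background mean b and standard deviation sdb have already been estimated elsewhere (in the study, by the mixture model fitted in R), and that the median is taken on the log2-transformed values; all function names are hypothetical.

```python
import math
from statistics import median

Z_DETECTION = 2.5   # Zs above this value: interaction considered detected
Z_CHANGE = 2.0      # Zs difference above this value: change considered significant


def log2_intensity(raw):
    """log2 transform after adding 1, so that a raw value of 0 stays defined."""
    return math.log2(raw + 1.0)


def interaction_score(replicates):
    """Median of the log2-transformed replicates (Is); None below two replicates."""
    if len(replicates) < 2:
        return None
    return median(log2_intensity(v) for v in replicates)


def z_score(i_s, b, sd_b):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background distribution."""
    return (i_s - b) / sd_b


def is_detected(zs):
    return zs > Z_DETECTION


def is_changed(zs_reference, zs_longer):
    return abs(zs_longer - zs_reference) > Z_CHANGE
```

With a hypothetical background of b = 10 and sdb = 0.5, an interaction score of 12 gives Zs = 4 and would be called detected.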

For the intra-complexes experiment, extreme outliers on the MTX selection plates, lying below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were conserved and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered as being detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
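The extreme-outlier rule used for the intra-complexes experiment can be sketched as follows. This is an illustrative Python version, not the code used for the analyses: the function names are hypothetical, and the quartile convention (linear interpolation, the "inclusive" method of the standard library) is an assumption, since the text does not state which one was used.

```python
from statistics import quantiles


def extreme_outlier_bounds(values):
    """Return (Q1 - 3*IQR, Q3 + 3*IQR) for a list of colony sizes."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 3.0 * iqr, q3 + 3.0 * iqr


def drop_extreme_outliers(values):
    """Keep only the values lying inside the extreme-outlier bounds."""
    low, high = extreme_outlier_bounds(values)
    return [v for v in values if low <= v <= high]
```

The factor of 3 on the interquartile range is more permissive than the more common factor of 1.5, so only colonies that are far from the bulk of the distribution are removed.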

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination: 2xL-4xL, 4xL-2xL and 4xL-4xL. A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than −1, with t-test FDR p-value < 0.05) (Table S1C).
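The three categories defined above can be expressed as a small decision function. The Python sketch below is an interpretation of the written rules, not the analysis code of the study: the dictionary keys (`detected`, `diff`, `fdr_p`) are hypothetical, and the "unclassified" and "not detected" outcomes are additions for cases that the text does not explicitly cover.

```python
P_MAX = 0.05         # FDR-corrected p-value cutoff (from the text)
MIN_ABS_DIFF = 1.0   # minimal absolute difference in Is versus 2xL-2xL (from the text)


def classify_pair(detected_reference, longer_combinations):
    """Classify a protein pair from its 2xL-2xL result and its longer-linker results.

    `longer_combinations` is a list of dicts, one per longer linker combination
    (2xL-4xL, 4xL-2xL, 4xL-4xL), with hypothetical keys:
      'detected' : bool, Zs above the detection threshold
      'diff'     : float, Is difference versus the 2xL-2xL reference
      'fdr_p'    : float, FDR-corrected p-value of the paired t-test
    """
    if not detected_reference:
        if any(c["detected"] for c in longer_combinations):
            return "new"
        return "not detected"
    for c in longer_combinations:
        if c["fdr_p"] < P_MAX and abs(c["diff"]) > MIN_ABS_DIFF:
            return "quantitative change"
    if all(c["detected"] and c["fdr_p"] > P_MAX for c in longer_combinations):
        return "unchanged"
    return "unclassified"  # case not described in the text
```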

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were taken from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.

The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered as surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
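The idea of measuring a distance along the surface of a complex, rather than in a straight line through it, can be illustrated with a small self-contained Python sketch. It is not the NetworkX-based code used for the analyses: it reimplements Dijkstra's algorithm with the standard library, the function names are hypothetical, and the node coordinates in the usage note are invented for the example.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`; edge weight = distance.

    `coords` maps a node label (e.g. a surface residue) to its (x, y, z) position.
    """
    graph = {node: [] for node in coords}
    nodes = list(coords)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when the target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return math.inf
```

With three nodes at (0, 0, 0), (10, 10, 0) and (20, 0, 0) and a cutoff of 15 Å, the two end nodes are 20 Å apart in a straight line but are not directly connected; the path through the middle node is about 28.3 Å long, which is the value such a surface-following distance would report.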

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n = 45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] accounting for only one PPI) after filtration.

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). The PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.
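As a quick consistency check on the counts quoted in this paragraph, the short Python sketch below recomputes the totals and the proportion of unchanged pairs. It only uses the numbers given in the text; the variable and function names are hypothetical.

```python
# Pair counts quoted in the text, as (all pairs, unique pairs).
COUNTS = {
    "unchanged": (135, 114),
    "quantitatively changed": (19, 18),
    "new": (74, 65),
}
TOTAL = (228, 197)


def column_sum(categories, column):
    """Sum one column (0 = all pairs, 1 = unique pairs) over the given categories."""
    return sum(COUNTS[name][column] for name in categories)


affected = tuple(column_sum(("quantitatively changed", "new"), c) for c in (0, 1))
totals = tuple(column_sum(COUNTS, c) for c in (0, 1))
percent_unchanged = tuple(100.0 * COUNTS["unchanged"][c] / TOTAL[c] for c in (0, 1))

print(affected, totals, [round(p, 1) for p in percent_unchanged])
```

The three categories add up to the 228 (197 unique) filtered pairs, the changed and new categories add up to the 93 (83 unique) affected pairs, and the unchanged fraction is a little under 60% in both cases.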

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases the signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institute of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canadian Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.

Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.

Figure 2. Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.

Table S1A. Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference      Yeast protein  Complex     Identity (%)
4C2M chain 1   Rpc10          RNApol I    100
4C2M chain 2   Rpa34          RNApol I    92.4
4C2M chain 3   Rpa49          RNApol I    94.4
4C2M chain 4   Rpa43          RNApol I    100
4C2M chain 5   Rpa190         RNApol I    89.7
4C2M chain 6   Rpc40          RNApol I    100
4C2M chain 7   Rpa135         RNApol I    97.2
4C2M chain 8   Rpb5           RNApol I    100
4C2M chain 9   Rpa14          RNApol I    59.6
4C2M chain 10  Rpa43          RNApol I    81.4
4C2M chain 11  Rpo26          RNApol I    100
4C2M chain 12  Rpa12          RNApol I    100
4C2M chain 13  Rpb8           RNApol I    88.2
4C2M chain 14  Rpc19          RNApol I    100
4C2M chain 15  Rpb10          RNApol I    100
4C2M chain 16  Rpa49          RNApol I    100
4C2M chain 17  Rpc10          RNApol I    100
4C2M chain 18  Rpa43          RNApol I    100
4C2M chain 19  Rpa34          RNApol I    92.4
4C2M chain 20  Rpa135         RNApol I    96.2
4C2M chain 21  Rpa190         RNApol I    88.5
4C2M chain 22  Rpa14          RNApol I    55.1
4C2M chain 23  Rpc40          RNApol I    100
4C2M chain 24  Rpo26          RNApol I    100
4C2M chain 25  Rpb5           RNApol I    100
4C2M chain 26  Rpb8           RNApol I    88.2
4C2M chain 27  Rpa43          RNApol I    80.2
4C2M chain 28  Rpb10          RNApol I    100
4C2M chain 29  Rpa12          RNApol I    96
4C2M chain 30  Rpc19          RNApol I    100
4C3I chain A   Rpa190         RNApol I    89.2
4C3I chain C   Rpc40          RNApol I    99.3
4C3I chain B   Rpa135         RNApol I    98.2
4C3I chain E   Rpb5           RNApol I    100
4C3I chain D   Rpa14          RNApol I    55.1
4C3I chain G   Rpa43          RNApol I    78.3
4C3I chain F   Rpo26          RNApol I    100
4C3I chain I   Rpa12          RNApol I    100
4C3I chain H   Rpb8           RNApol I    84.7
4C3I chain K   Rpc19          RNApol I    100
4C3I chain J   Rpb10          RNApol I    100
4C3I chain M   Rpa49          RNApol I    97.2
4C3I chain L   Rpc10          RNApol I    100
4C3I chain N   Rpa34          RNApol I    88
4V1N chain A   Rpo21          RNApol II   97.9
4V1N chain C   Rpb3           RNApol II   100
4V1N chain B   Rpb2           RNApol II   93.6
4V1N chain E   Rpb5           RNApol II   100
4V1N chain D   Rpb4           RNApol II   80.8
4V1N chain G   Rpb7           RNApol II   100
4V1N chain F   Rpo26          RNApol II   100
4V1N chain I   Rpb9           RNApol II   100
4V1N chain H   Rpb8           RNApol II   91
4V1N chain K   Rpb11          RNApol II   100
4V1N chain J   Rpb10          RNApol II   100
4V1N chain L   Rpc10          RNApol II   100
4V1N chain R   Tfg2           RNApol II   60.3
5FJA chain A   Rpo31          RNApol III  96.2
5FJA chain C   Rpc40          RNApol III  100
5FJA chain B   Ret1           RNApol III  100
5FJA chain E   Rpb5           RNApol III  100
5FJA chain D   Rpc17          RNApol III  73.9
5FJA chain G   Rpc25          RNApol III  85.8
5FJA chain F   Rpo26          RNApol III  100
5FJA chain I   Rpc11          RNApol III  82.7
5FJA chain H   Rpb8           RNApol III  94.5
5FJA chain K   Rpc19          RNApol III  100
5FJA chain J   Rpb10          RNApol III  100
5FJA chain M   Rpc37          RNApol III  84.9
5FJA chain L   Rpc10          RNApol III  100
5FJA chain O   Rpc82          RNApol III  84.3
5FJA chain N   Rpc53          RNApol III  73.8
5FJA chain Q   Rpc31          RNApol III  100
5FJA chain P   Rpc34          RNApol III  57.2

Table S2C. Identity between the proteasome structure and the experimental sequences

Reference                        Yeast protein  Complex     Identity (%)
5CZ4-centered chain A            Pre8           Proteasome  100
5CZ4-centered chain AA           Pre4           Proteasome  100
5CZ4-centered chain B            Pre9           Proteasome  100
5CZ4-centered chain BA           Pre3           Proteasome  100
5CZ4-centered chain C            Pre6           Proteasome  100
5CZ4-centered chain D            Pup2           Proteasome  97.1
5CZ4-centered chain E            Pre5           Proteasome  100
5CZ4-centered chain F            Pre10          Proteasome  100
5CZ4-centered chain G            Scl1           Proteasome  100
5CZ4-centered chain H            Pup1           Proteasome  100
5CZ4-centered chain I            Pup3           Proteasome  100
5CZ4-centered chain J            Pre1           Proteasome  100
5CZ4-centered chain K            Pre2           Proteasome  100
5CZ4-centered chain L            Pre7           Proteasome  100
5CZ4-centered chain M            Pre4           Proteasome  100
5CZ4-centered chain N            Pre3           Proteasome  100
5CZ4-centered chain O            Pre8           Proteasome  100
5CZ4-centered chain P            Pre9           Proteasome  100
5CZ4-centered chain Q            Pre6           Proteasome  100
5CZ4-centered chain R            Pup2           Proteasome  97.1
5CZ4-centered chain S            Pre5           Proteasome  100
5CZ4-centered chain T            Pre10          Proteasome  100
5CZ4-centered chain U            Scl1           Proteasome  100
5CZ4-centered chain V            Pup1           Proteasome  100
5CZ4-centered chain W            Pup3           Proteasome  100
5CZ4-centered chain X            Pre1           Proteasome  100
5CZ4-centered chain Y            Pre2           Proteasome  100
5CZ4-centered chain Z            Pre7           Proteasome  100
5A5B-centered chain A            Pre3           Proteasome  100
5A5B-centered chain AA           Rpn7           Proteasome  100
5A5B-centered chain B            Pup1           Proteasome  100
5A5B-centered chain BA           Rpn3           Proteasome  100
5A5B-centered chain C            Pup3           Proteasome  100
5A5B-centered chain CA           Rpn12          Proteasome  100
5A5B-centered chain D            Pre1           Proteasome  100
5A5B-centered chain DA           Rpn8           Proteasome  82.9
5A5B-centered chain E            Pre2           Proteasome  99.5
5A5B-centered chain EA           Rpn11          Proteasome  89.5
5A5B-centered chain F            Pre7           Proteasome  100
5A5B-centered chain FA           Rpn10          Proteasome  100
5A5B-centered chain G            Pre4           Proteasome  100
5A5B-centered chain GA           Rpn13          Proteasome  100
5A5B-centered chain HA           Sem1           Proteasome  100
5A5B-centered chain IA           Rpn1           Proteasome  85.9
5A5B-centered chain J            Scl1           Proteasome  100
5A5B-centered chain K            Pre8           Proteasome  100
5A5B-centered chain L            Pre9           Proteasome  100
5A5B-centered chain M            Pre6           Proteasome  100
5A5B-centered chain N            Pup2           Proteasome  100
5A5B-centered chain O            Pre5           Proteasome  100
5A5B-centered chain P            Pre10          Proteasome  100
5A5B-centered chain Q            Rpt1           Proteasome  88
5A5B-centered chain R            Rpt2           Proteasome  100
5A5B-centered chain S            Rpt6           Proteasome  100
5A5B-centered chain T            Rpt3           Proteasome  100
5A5B-centered chain U            Rpt4           Proteasome  100
5A5B-centered chain V            Rpt5           Proteasome  93.1
5A5B-centered chain W            Rpn2           Proteasome  90.9
5A5B-centered chain X            Rpn9           Proteasome  100
5A5B-centered chain Y            Rpn5           Proteasome  100
5A5B-centered chain Z            Rpn6           Proteasome  100
Constructed proteasome chain 1   Pup1           Proteasome  100
Constructed proteasome chain 10  Pre8           Proteasome  100
Constructed proteasome chain 11  Pre9           Proteasome  100
Constructed proteasome chain 12  Pre6           Proteasome  100
Constructed proteasome chain 13  Pup2           Proteasome  100
Constructed proteasome chain 14  Pre5           Proteasome  100
Constructed proteasome chain 15  Pre10          Proteasome  100
Constructed proteasome chain 16  Rpt1           Proteasome  88
Constructed proteasome chain 17  Rpt2           Proteasome  100
Constructed proteasome chain 18  Rpt6           Proteasome  100
Constructed proteasome chain 19  Rpt3           Proteasome  100
Constructed proteasome chain 2   Pup3           Proteasome  100
Constructed proteasome chain 20  Rpt4           Proteasome  100
Constructed proteasome chain 21  Rpt5           Proteasome  93.1
Constructed proteasome chain 22  Rpn2           Proteasome  90.9
Constructed proteasome chain 23  Rpn9           Proteasome  100
Constructed proteasome chain 24  Rpn5           Proteasome  100
Constructed proteasome chain 25  Rpn6           Proteasome  100
Constructed proteasome chain 26  Rpn7           Proteasome  100
Constructed proteasome chain 27  Rpn3           Proteasome  100
Constructed proteasome chain 28  Rpn12          Proteasome  100
Constructed proteasome chain 29  Rpn8           Proteasome  82.9
Constructed proteasome chain 3   Pre1           Proteasome  100
Constructed proteasome chain 30  Rpn11          Proteasome  89.5
Constructed proteasome chain 31  Rpn10          Proteasome  100
Constructed proteasome chain 32  Rpn13          Proteasome  100
Constructed proteasome chain 33  Sem1           Proteasome  100
Constructed proteasome chain 34  Rpn1           Proteasome  85.9
Constructed proteasome chain 35  Pup1           Proteasome  100
Constructed proteasome chain 36  Pup3           Proteasome  100
Constructed proteasome chain 37  Pre1           Proteasome  100
Constructed proteasome chain 38  Pre2           Proteasome  100
Constructed proteasome chain 39  Pre7           Proteasome  100
Constructed proteasome chain 4   Pre2           Proteasome  100
Constructed proteasome chain 40  Pre4           Proteasome  100
Constructed proteasome chain 41  Pre3           Proteasome  100
Constructed proteasome chain 42  Pre4           Proteasome  100
Constructed proteasome chain 45  Scl1           Proteasome  100
Constructed proteasome chain 46  Pre8           Proteasome  100
Constructed proteasome chain 47  Pre9           Proteasome  100
Constructed proteasome chain 48  Pre6           Proteasome  100
Constructed proteasome chain 49  Pup2           Proteasome  100
Constructed proteasome chain 5   Pre7           Proteasome  100
Constructed proteasome chain 50  Pre5           Proteasome  100
Constructed proteasome chain 51  Pre10          Proteasome  100
Constructed proteasome chain 52  Rpt1           Proteasome  88
Constructed proteasome chain 53  Rpt2           Proteasome  100
Constructed proteasome chain 54  Rpt6           Proteasome  100
Constructed proteasome chain 55  Rpt3           Proteasome  100
Constructed proteasome chain 56  Rpt4           Proteasome  100
Constructed proteasome chain 57  Rpt5           Proteasome  93.1
Constructed proteasome chain 58  Rpn2           Proteasome  90.9
Constructed proteasome chain 59  Rpn9           Proteasome  100
Constructed proteasome chain 6   Pre3           Proteasome  100
Constructed proteasome chain 60  Rpn5           Proteasome  100
Constructed proteasome chain 61  Rpn6           Proteasome  100
Constructed proteasome chain 62  Rpn7           Proteasome  100
Constructed proteasome chain 63  Rpn3           Proteasome  100
Constructed proteasome chain 64  Rpn12          Proteasome  100
Constructed proteasome chain 65  Rpn8           Proteasome  82.9
Constructed proteasome chain 66  Rpn11          Proteasome  89.5
Constructed proteasome chain 67  Rpn10          Proteasome  100
Constructed proteasome chain 68  Rpn13          Proteasome  100
Constructed proteasome chain 69  Sem1           Proteasome  100
Constructed proteasome chain 70  Rpn1           Proteasome  85.9
Constructed proteasome chain 9   Scl1           Proteasome  100

Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

Yeast proteins Complex Reference Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6

Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23

Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7
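
Counts such as those in Table S2D can be obtained by comparing the length of the full protein sequence with the last residue resolved in the structure. The following is a minimal sketch in Python, assuming that residue numbering in the structure follows the full-length sequence. The function name and the numbers in the usage line are hypothetical and are not taken from the PDB entries listed above.

```python
def missing_c_terminal_residues(sequence_length, resolved_residue_numbers):
    """Count residues absent from the structure after the last resolved one.

    Assumes (hypothetically) that residue numbers in the structure match
    positions in the full-length sequence, starting at 1.
    """
    resolved = list(resolved_residue_numbers)
    if not resolved:
        # Nothing is resolved, so the whole chain counts as missing.
        return sequence_length
    last_resolved = max(resolved)
    if last_resolved > sequence_length:
        raise ValueError("resolved residue number exceeds sequence length")
    return sequence_length - last_resolved


# Hypothetical example: a 137-residue protein resolved from residue 1 to 100.
example_missing = missing_c_terminal_residues(137, range(1, 101))
```

A value of 0 means that the C-terminus itself is resolved, so the position of the fused reporter fragment can be anchored directly on the structure.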

Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths, according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right because the 4xL linker can detect interactions that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
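
The positive-call rule of panel (B) and the reciprocal correlation of panel (C) can be sketched as follows. This is an illustration only: it assumes a plain z-score computed against the mean and sample standard deviation of all colony sizes, which may differ from the normalization actually applied in the experiments. The function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def z_scores(colony_sizes):
    """Standardize colony sizes against their mean and sample SD (assumed scheme)."""
    mu = mean(colony_sizes)
    sigma = stdev(colony_sizes)
    return [(size - mu) / sigma for size in colony_sizes]


def positive_calls(colony_sizes, z_threshold=2.5):
    """Flag a PPI as positive when its z-score exceeds the threshold."""
    return [z > z_threshold for z in z_scores(colony_sizes)]


def pearson_r(xs, ys):
    """Pearson correlation, e.g. between bait-prey and prey-bait signals."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy data: twenty background colonies and one clearly larger colony.
sizes = [100] * 20 + [1000]
calls = positive_calls(sizes)
```

In a real screen most pairs do not interact, so the bulk of the distribution serves as the background against which the few large colonies stand out.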

Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
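
The distance-weighted shortest path of panel (C) can be illustrated with Dijkstra's algorithm on a graph whose nodes are surface residues. This is a sketch under stated assumptions, not the procedure used for the figure: two residues are connected when they lie within a cutoff distance, each edge is weighted by the Euclidean distance between them, and the coordinates, residue labels and cutoff below are hypothetical.

```python
import heapq
import math


def shortest_surface_path(coords, source, target, cutoff):
    """Length of the shortest path from source to target through surface points.

    coords maps a residue label to (x, y, z). Two residues are neighbours
    when they are at most `cutoff` apart (an assumed connectivity rule).
    Returns math.inf when the target cannot be reached.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    visited = set()
    while heap:
        dist_so_far, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            return dist_so_far
        for other, xyz in coords.items():
            if other in visited:
                continue
            step = math.dist(coords[node], xyz)
            candidate = dist_so_far + step
            if step <= cutoff and candidate < best.get(other, math.inf):
                best[other] = candidate
                heapq.heappush(heap, (candidate, other))
    return math.inf


# Hypothetical coordinates: the direct a-c hop (6.0) exceeds the cutoff,
# so the path has to go through the intermediate surface residue b.
toy_coords = {"a": (0, 0, 0), "b": (3, 4, 0), "c": (6, 0, 0)}
path_length = shortest_surface_path(toy_coords, "a", "c", cutoff=5.5)
```

Restricting the path to surface residues gives a distance that follows the outside of the complex, which is closer to what a flexible linker has to span than a straight line through the protein core.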

General conclusion

The aim of this project was to develop a relatively simple hybrid method. Here, a hybrid method is one that detects associations between proteins that are close to each other in space, without these associations necessarily being physical interactions. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to check whether increasing linker length changed the interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Among all the unique interactions detected, more than 30% were new. A similar proportion of new interactions could therefore be expected if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths that are used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and those whose signal increases. The rare cases in which the signal decreased as linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, in particular by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when central proteins are involved. Finally, the associations detected closely reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to the DHFR PCA is a valuable advance in the study of protein-protein associations. Doubling only the length of the linker of the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the linkers of other lengths. This would provide a validation and a point of comparison, and would help detect problems that occurred during construction of the proteins, such as incorrect recombination or the appearance of mutations. Interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control certainly makes the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no particular infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These features provide certain advantages over other hybrid methods.

Only two linker lengths were tested in this project. It would be interesting to establish a range of linker lengths in order to obtain several resolutions of the PPI network. The maximum length that still detects plausible protein-protein associations while limiting false positives would first have to be determined. The optimal increment would also have to be determined, so as to maximize new information while taking into account the complexity added by each additional linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. The longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved in the degradation and the storage, respectively, of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.

Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.

22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.

43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


Résumé

Protein-protein interactions (PPIs) underlie cellular function in all organisms. Methods for studying PPIs fall into two categories: they either identify the proteins that make up a complex or determine the relationships between proteins. Few hybrid methods provide both types of information, and those that exist have several limitations. The goal of this project was to develop a new hybrid method by modifying the protein-fragment complementation assay (DHFR PCA) in the yeast Saccharomyces cerevisiae. DHFR PCA relies on the association of two complementary reporter fragments in the presence of a protein-protein interaction. The reporter fragments are fused to the proteins through a peptide linker. The length of the linker limits the maximum distance at which an interaction between two proteins can be detected. Our hypothesis was that increasing linker length would allow us to detect more distant interactions. We first verified that increasing linker length changed our ability to detect interactions without any loss of specificity. New interactions were detected within a single protein complex and between two complexes. We then validated our ability to better dissect the architecture of protein complexes by examining five protein complexes in depth with several combinations of linker lengths. Finally, we confirmed that the method does detect interactions between more distant proteins by comparing our results with distances calculated from the available proteasome structures. This variation on DHFR PCA makes it possible to tune the resolution at which PPIs are studied and thus to better define the architecture of protein complexes.

Abstract

Protein-protein interactions (PPIs) are central to all cellular processes in all organisms. Methods to study PPIs fall into two categories: they either identify the proteins composing protein complexes or determine the relationships between proteins. Only a few hybrid methods provide both kinds of information, and these methods have many limitations. The goal of this project was to develop a new hybrid method by modifying the protein-fragment complementation assay (DHFR PCA) in the yeast Saccharomyces cerevisiae. DHFR PCA is based on the association of two complementary reporter fragments in the presence of an interaction. Both fragments are fused to proteins through a peptide linker. Linker length limits the maximal distance at which an interaction between two proteins can be detected. Our hypothesis was that increased linker length would allow the detection of more distant interactions. We first verified that increasing linker length modified our capacity to detect interactions without loss of specificity. New interactions were detected within and between complexes. We then validated our capacity to better dissect the architecture of protein complexes by studying five protein complexes with different combinations of linker lengths. Finally, we confirmed that the method allows the detection of interactions between proteins that are farther apart in space by comparing our results with distances calculated from available proteasome structures. This variation of DHFR PCA makes it possible to modulate the resolution of PPI studies and thus to better define the architecture of protein complexes.

Table of contents

Résumé
Abstract
Table of contents
List of tables
List of figures
List of abbreviations
Acknowledgements
Foreword
General introduction
  1.1 The fundamental role of protein-protein interactions
  1.2 Practical applications of the study of protein-protein interactions
  1.3 Categories of methods for studying protein-protein interactions
    1.3.1 Methods that identify the members of a protein complex: purification of protein complexes followed by mass spectrometry
    1.3.2 Methods that determine the protein interaction network
  1.4 Current challenge in the study of protein-protein interactions
  1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions
  1.6 Research objectives
Measuring proximate protein association in living cells using Protein-fragment complementation assay (PCA)
  Résumé
  Abstract
  Introduction
  Material and Methods
    Yeast
    Bacteria
    Plasmid construction
    Strain construction
    Estimation of protein abundance
    Protein-fragment complementation assays
    PCA images and statistical analyses
    Analysis of protein distances within complexes
  Results and discussion
    Longer linkers increase signal-to-noise ratio in large-scale screens
    PCA signal reflects the super-organization of protein complexes
    Longer linkers allow detection of more distant proteins in complexes
  Conclusion
  Acknowledgements
General conclusion
Bibliography

List of tables

Table S1A. Description of the strains constructed and used for this study
Table S1B. PCA data for global PCA experiment
Table S1C. PCA data for intra-complexes experiment
Table S1D. PCR primers used in this study
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2B. Identity between each RNApol structure and the experimental sequences
Table S2C. Identity between proteasome structure and the experimental sequence
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

List of figures

Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale Protein-fragment complementation (PCA) screen and prove useful to infer the super-organization of protein complexes
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes
Figure S1. Data related to the PCA experiments
Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

List of abbreviations

%: Percent
°C: Degree Celsius
Å: Ångström
DNA: Deoxyribonucleic acid
Amp: Ampicillin
mRNA: Messenger ribonucleic acid
BioID: Proximity-dependent biotinylation
ClonNAT: Nourseothricin
COG: Conserved oligomeric Golgi
DHFR: Dihydrofolate reductase
DMSO: Dimethyl sulfoxide
F[1,2]: Fragment 1,2 of DHFR
F[3]: Fragment 3 of DHFR
FDR: Corrected P-value
FRET: Energy transfer between fluorescent molecules
g: Gram
Gly or G: Glycine
h: Hour
HygB: Hygromycin B
Is: Interaction score
L: Liter
Log: Logarithm
M: Molar
Min: Minute
mL: Milliliter
mM: Millimolar
MS: Mass spectrometry
MS/MS: Tandem mass spectrometry
MTX: Methotrexate
MYTH: Membrane yeast two-hybrid
NaCl: Sodium chloride
NMR: Nuclear magnetic resonance
OD: Optical density
PBS: Phosphate-buffered saline
PCA: Protein-fragment complementation assay
PCR: Polymerase chain reaction
PKA: Protein kinase A
PPI: Protein-protein interaction
Q1: First quartile
Q3: Third quartile
r: Correlation coefficient
RNApol: RNA polymerase
Sdb: Standard deviation
Ser or S: Serine
SDS: Sodium dodecyl sulfate
SDS-PAGE: Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test: Student's t-test
YPD: Yeast extract peptone dextrose
Y2H: Yeast two-hybrid
Zs: Z-score
μb: Estimated mean
μg: Microgram
μL: Microliter
μM: Micromolar
2YT: 2× yeast extract tryptone
2xL: Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL: Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL: Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif

Acknowledgements

Completing this project required the help of many people, whom I sincerely wish to thank. First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey, Christian encouraged me to give the best of myself, both scientifically and as a member of the group. He not only gave me the material means to do so, but also showed me that I had the ability to do it. Christian is a very present supervisor who is available for his students. He offered me opportunities and supported me in each of them.

I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr. Nicolas Bisson, for their advice and the time they devoted to this project.

I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the laboratory's two research professionals. Their great expertise and their passion for science are a pillar of this team. Without their invaluable advice, dedication and availability, carrying out this project would have been particularly arduous. I also wish to thank my collaborators, Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is all the better. Special thanks to Xavier for his help, his availability and the lively discussions.

I believe it is important to thank all the members of the Landry laboratory. Graduate studies require spending a great deal of time in the laboratory, which becomes like a second home. Hence the importance of sharing laughs and building a rapport with its members. I would like to thank them all for the chats and jokes at the famous "tea breaks", the lively discussions and, of course, the support, both in the laboratory and morally. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. Very special thanks to Guillaume and Hélène, who were especially good at putting a smile on my face or supporting and advising me in difficult times.

It is also important to thank my parents, as well as all my family and friends. My parents have always encouraged me to fulfil myself and to love my work. They not only provided an ideal environment for reaching my goals throughout my studies, but also offered me their moral support and instilled in me the importance of always doing one's best. The values they passed on to me gave me a strong sense of responsibility, honesty and commitment. Thanks to my family and friends, I was able to unwind, simply have fun and pour my heart out from time to time. They were a source of moral support.

Finally, I wish to thank, from the bottom of my heart, my partner, Marc Bélanger. Marc is an incredibly generous person: generous with his time, his attention, his knowledge and his passions. He was an invaluable support throughout this journey, at every moment. His encouragement, his shoulder, his tissues and his understanding soothed my fears and my sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has given me personally, humanly and professionally. Marc has made me a better person, and I will always be grateful to him. Thank you, my love, thank you for everything.

Foreword

This thesis comprises a single chapter written in the form of a scientific article that will be submitted for publication. The article presents the adaptation of the PCA method to detect associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to planning the experiments with Christian R. Landry (project supervisor) and with Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research professionals). Several people, myself included, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were done jointly by Isabelle Gagnon-Arsenault, Christian Landry and myself.

During this project, I also contributed to writing a literature review published in Briefings in functional genomics in March 2016 under the title "Multi-scale perturbations of protein interactomes reveals their mechanisms of regulation, robustness and insights into genotype-phenotype maps". Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry. That article is not presented in this thesis.

General introduction

1.1 The fundamental role of protein-protein interactions

Because of the great diversity of their roles, proteins are considered the machinery of life. Their temporary or permanent associations are at the heart of signalling and regulatory pathways as well as protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since they are involved in all cellular processes and in the maintenance of cellular functions.

Interactions that form transiently are often found in signalling and regulatory processes. They require excellent spatiotemporal coordination, which explains why poor coordination can lead to diseases such as cancer (1). One example of a transient association is that of the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, ageing and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).

In addition to transient interactions, the cell hosts stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs within a complex are stable, the complex itself may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association of these proteins allows the emergence of additional biological activities that would be impossible for the proteins considered individually. An example that illustrates this concept very well is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).

The two preceding examples clearly demonstrate the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning and cellular viability of a given organism. The PPI network has been mapped on a large scale for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses (27-29). These maps represent a static picture of the network and do not fully take into account the cell's capacity to adapt to different conditions (e.g., environment, cell cycle). To overcome this limitation, additional maps were subsequently produced that consider the dynamics of interaction networks, by perturbing the cell growth conditions. Among other things, they provide information on the adaptation or plasticity of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.
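As an illustration of how a static map and a map obtained under perturbed growth conditions can be compared, the following minimal sketch represents each network as a set of undirected protein pairs and reports which interactions are kept, lost or gained. The protein names (P1 to P4), the two conditions and the helper names `edge` and `compare_networks` are hypothetical placeholders, not data or code from this thesis.

```python
def edge(a: str, b: str) -> frozenset:
    """Undirected interaction between two proteins (order does not matter)."""
    return frozenset((a, b))


def compare_networks(reference: set, perturbed: set) -> dict:
    """Classify interactions as kept, lost or gained between two conditions."""
    return {
        "kept": reference & perturbed,    # detected in both conditions
        "lost": reference - perturbed,    # detected only in the reference map
        "gained": perturbed - reference,  # detected only after perturbation
    }


if __name__ == "__main__":
    # Hypothetical toy networks; real maps contain thousands of interactions.
    standard = {edge("P1", "P2"), edge("P2", "P3"), edge("P3", "P4")}
    stress = {edge("P1", "P2"), edge("P3", "P4"), edge("P1", "P4")}

    for label, edges in compare_networks(standard, stress).items():
        print(label, sorted(sorted(e) for e in edges))
```

An interaction that falls in the "kept" set is not necessarily stable; it has only been observed in every condition tested, which is consistent with the difficulty noted above.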

1.2 Practical applications of the study of protein-protein interactions

The study of PPIs offers a new perspective on fields such as evolution and medicine. The evolutionary history of protein complexes can be retraced by comparing PPIs, as demonstrated by the study of the nuclear pore of yeast and of the trypanosome (30). These two organisms, which diverged more than 1.5 billion years ago, show both similarities and differences in the structure of their nuclear pores. This essential protein complex forms a channel in the membrane of the cell nucleus and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mechanisms of mRNA export in the two organisms (30). In addition, perturbing PPIs makes it possible to elucidate the robustness of a protein complex to mutations, that is, the ability of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analysing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of certain proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).

In the medical field, the study of PPIs has been widely used to discover new drugs (32-34). In addition, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which should eventually make it possible to treat infections caused by these parasites (35). PPIs also help in understanding the genetic basis of diseases, as shown by Sahni and colleagues. This team examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of interaction networks was responsible for the diseases under study, with the networks affected either partially or completely. Moreover, different mutations in the same gene lead to different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, since each has advantages and limitations that restrict it to a different subset of the interaction network (37). Nevertheless, all of these methods can be divided into two main categories: methods that determine the composition of protein complexes and methods that determine physical interactions between two proteins.

The first category includes methods that purify a protein complex, by either affinity or separation chromatography, and then analyse it by mass spectrometry (MS). The second category groups together a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the second category share a very similar principle, based on the reconstitution of a functional reporter that emits a signal when the two proteins physically interact. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, the term "hybrid method" refers to methods that detect associations between proteins that are close together in space without these necessarily being physical interactions. These methods therefore have characteristics of both categories. For the purposes of this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary because they define, on the one hand, the components of a protein complex and, on the other, the relationships among those components.

1.3.1 Methods that identify the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purification of protein complexes followed by identification of their components by MS aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract by means of an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the non-specific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database makes it possible to identify the proteins found in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first MS, a peptide is selected and fragmented, and a new spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this purification is that it allows all the protein complexes of an organism to be isolated for study (43).
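As a rough illustration of the separation by mass-to-charge ratio described above, the sketch below computes the m/z value expected for a peptide at several charge states. It assumes positive-ion mode in which each charge comes from one added proton, which the text does not specify. The function name `mass_to_charge` and the 1500 Da peptide mass are hypothetical and chosen only for the example.

```python
# Mass of a proton in daltons (rounded).
PROTON_MASS_DA = 1.007276


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """Return m/z for a peptide carrying `charge` added protons.

    ASSUMPTION: ionization by protonation (positive-ion mode).
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


if __name__ == "__main__":
    peptide_mass = 1500.0  # hypothetical neutral peptide mass, in Da
    for z in (1, 2, 3):
        print(f"z = {z}: m/z = {mass_to_charge(peptide_mass, z):.4f}")
```

Under this assumption, the same peptide appears at a different position in the spectrum for each charge state it carries.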

1.3.2 Methods that determine the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment complementation

Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments attached to the two proteins of interest through a linker. When the two proteins of interest physically interact, the two reporter fragments assemble, reconstituting a functional reporter that makes it possible to detect a signal.
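To make the role of the linker concrete, the sketch below estimates an upper bound on the distance that two linkers could bridge between the proteins of interest. It is only a back-of-the-envelope model under stated assumptions: the linkers are repeats of the Gly-Gly-Gly-Gly-Ser motif (the 2xL, 3xL and 4xL linkers listed in the abbreviations), each residue of a fully extended chain is assumed to span about 3.5 Å, and the size of the reporter fragments and the flexibility of the chain are ignored. The function names are hypothetical.

```python
LINKER_MOTIF = "GGGGS"  # Gly-Gly-Gly-Gly-Ser repeat unit

# ASSUMPTION: approximate length contributed by one residue of a fully
# extended polypeptide chain, in angstroms. Real flexible linkers are
# rarely fully extended, so this gives an upper bound only.
ANGSTROM_PER_RESIDUE = 3.5


def linker_sequence(repeats: int) -> str:
    """Amino-acid sequence of a linker with the given number of repeats."""
    return LINKER_MOTIF * repeats


def max_linker_length(repeats: int) -> float:
    """Upper bound on the end-to-end length of one linker, in angstroms."""
    return len(linker_sequence(repeats)) * ANGSTROM_PER_RESIDUE


def max_reach(repeats_a: int, repeats_b: int) -> float:
    """Upper bound on the distance bridged by the two linkers together."""
    return max_linker_length(repeats_a) + max_linker_length(repeats_b)


def within_reach(distance: float, repeats_a: int, repeats_b: int) -> bool:
    """True if the two attachment points could, at most, be bridged."""
    return distance <= max_reach(repeats_a, repeats_b)


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(f"{n}xL: {linker_sequence(n)} -> at most {max_linker_length(n)} A")
    print("100 A apart, 2xL + 2xL:", within_reach(100.0, 2, 2))
    print("100 A apart, 4xL + 4xL:", within_reach(100.0, 4, 4))
```

Under these assumptions, a pair of 2xL linkers could span at most about 70 Å and a pair of 4xL linkers about 140 Å; because a flexible chain is rarely fully extended, the effective reach is expected to be shorter.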

In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that enabled the development of several others. However, the technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variations have nevertheless been introduced to allow the study of membrane proteins (45-47); this approach is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, shows background noise and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. Consequently, it is impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it makes it possible to study the PPIs of another species, such as humans, in a simpler model (51).

In MYTH, the two reporter fragments are derived from a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest physically interact, the transcription factor attached to the reconstituted ubiquitin is released, activating transcription of a reporter gene. Split-ubiquitin-based methods have enabled major advances in the study of proteins that are membrane-bound, insoluble or located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the impossibility of quantifying the results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of using a transcription factor as the reporter, it uses a protein that has been split into two fragments. The choice of reporter and of the cleavage site were decisive elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to interact spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved, in particular, in the synthesis of certain DNA bases (purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative and of retaining the native promoter of the proteins studied (48, 55, 56). Furthermore, PCA results suggest that the cellular localization of the proteins is preserved: there is gene ontology enrichment for several known proteins sharing the same cellular localization (55). However, a change in localization cannot be ruled out, because the reporter fragments are added at the C-terminus, which could interfere with the proteins' localization signal sequence (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. However, adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two proteins. It may be necessary to optimize the composition or length of the linker. There are three categories of linkers: flexible linkers, rigid linkers and linkers cleavable in vivo. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are mostly useful when a flexible linker is insufficient to separate the two elements properly or when it interferes with the activity of the protein. Linkers cleavable in vivo allow the reporter fragment to be released under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. Consequently, it is essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on energy transfer between two fluorescent proteins in close proximity. The two fluorescent proteins are fused to the two proteins whose proximity is being tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are close to each other. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, high background noise and partial overlap of the fluorescence of the two proteins can hamper the interpretation of results (60-63).

Cross-linking followed by MS is practically identical to the purification and MS techniques, except that the proteins are attached to each other by covalent bonds before purification. These bonds resist enzymatic digestion, thereby providing structural information on how the proteins associate within the protein complex. Nevertheless, cross-linking complicates data analysis and can lead to a misconception of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity, fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to interactions between neighbouring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can impair the activity of the protein of interest or the formation of interactions. In addition, this method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly interesting because they give a more global view of the PPI network. They provide information on protein proximity, giving access to a new molecular scale of resolution that is otherwise difficult to reach. In addition to being complex, existing techniques require specific infrastructure (equipment and databases) and are difficult to apply at large scale. Developing simpler, higher-throughput hybrid methods would make it possible to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. These new hybrid methods would compensate for the shortcomings of high-resolution methods, such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to many protein complexes and require an approach specific to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and of the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good association of the reporter fragments in the cellular environment. Indeed, glycine and serine are both small amino acids; glycine is nonpolar and serine is polar but uncharged. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also imposes a constraint on the ability to detect an interaction, as was notably observed by the research team that developed large-scale PCA (55). While studying RNA polymerase (RNApol) II and several other protein complexes, the authors noticed that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, by the same token, to obtain new structural information.
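The distance argument above can be made concrete with a rough sketch. It assumes that the reach of a pair of linkers scales linearly with the number of residues, using the per-residue span implied by the 82 Å figure for two 10-residue linkers (82 / 20 = 4.1 Å per residue). The linear scaling, the function name and the repeat counts shown are illustrative assumptions, not results from the cited study; flexible linkers rarely adopt a fully extended conformation.

```python
# Assumption: maximum span grows linearly with linker length.
# 82 Å for two linkers of 2 x GGGGS (10 residues each) gives 4.1 Å per residue.
ANGSTROM_PER_RESIDUE = 82.0 / 20


def max_span(repeats_a: int, repeats_b: int, motif_len: int = 5) -> float:
    """Approximate end-to-end span (in Å) of two linkers placed end to end.

    repeats_a, repeats_b: number of GGGGS repeats in each linker.
    motif_len: residues per repeat (5 for GGGGS).
    """
    return (repeats_a + repeats_b) * motif_len * ANGSTROM_PER_RESIDUE


# Hypothetical repeat counts, for illustration only
for a, b in [(2, 2), (2, 4), (4, 4)]:
    print(f"{a} + {b} repeats: about {max_span(a, b):.0f} Å")
```

Under this simple assumption, doubling both linkers doubles the theoretical reach, which is the intuition behind using linker length to probe more distant associations.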

1.6 Research objectives

The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length of the DHFR PCA would allow the detection of interactions increasingly distant in space, which would modulate the scale of molecular resolution. This adaptation would then yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The last objective was to determine whether increasing linker length allowed the detection of associations between proteins more distant in space. To do so, distances were calculated between the proteins contained in the proteasome structures and compared with the experimental results.

This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive for several reasons, notably the availability of numerous powerful genetic tools, its rapid cell division, and the abundance of data on protein complex structures and PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the dihydrofolate reductase-based protein-fragment complementation assay (DHFR PCA) in yeast to obtain low-resolution structural constraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for detecting and measuring protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories, based on whether they infer co-complex membership or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, assigning a protein to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods that detect co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint imposed by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here we test the effect of linker size on the ability to detect PPIs in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.

To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed into E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR products were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.
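Insert-to-vector ratios for Gibson assembly are usually expressed in moles rather than mass. Assuming that is the case here, the sketch below shows the usual conversion from a molar ratio to a DNA mass. The fragment lengths and the vector mass in the example are hypothetical placeholders, not values from this study.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   molar_ratio: float) -> float:
    """Mass of insert (ng) that gives `molar_ratio` insert molecules per vector molecule.

    For double-stranded DNA, moles are proportional to mass / length, so the
    insert mass scales with the length ratio and the desired molar ratio.
    """
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


# Hypothetical example: 100 ng of a 5000 bp vector and a 500 bp insert at 5:1
print(insert_mass_ng(vector_ng=100, vector_bp=5000, insert_bp=500, molar_ratio=5))
```

For a three-fragment assembly such as the 4:4:1 case, the same function would be called once per insert with its own length.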

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were made at the 3′ end of genes. 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions in all strains.

Estimation of protein abundance

Protein abundance was quantified by Western blot for several strains with proteins fused to the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies raised against them. Exponentially growing cells (20 OD600) were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads (425-600 µm; Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins, fused to 2x/3x/4xL-DHFR F[1,2], that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belonged to the same complexes as the baits or interacted with one of them according to data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published by (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 of the 95 proteins initially selected are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, within a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
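The block design described above can be sketched in a few lines. The sketch enumerates the four linker combinations of one block and estimates how many such blocks fit on a 1536-format array once a double border is set aside. It assumes a standard 32 x 48 layout and a border occupying the two outermost rows and columns on every side; the function names and the order of bait and prey within each label are illustrative choices.

```python
from itertools import product

LINKERS = ("2xL", "4xL")


def linker_block():
    """All linker combinations tested within one block of four colonies."""
    return [f"{first}-{second}" for first, second in product(LINKERS, repeat=2)]


def interior_positions(rows: int = 32, cols: int = 48, border: int = 2) -> int:
    """Colony positions left on a rows x cols array after removing the border."""
    return (rows - 2 * border) * (cols - 2 * border)


print(linker_block())
print(interior_positions() // 4, "blocks of four fit inside the double border")
```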

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey plates were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we retested by standard DHFR PCA 25 PPIs for which the signal had increased, remained unchanged or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments is shown in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).
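The cell densities spotted in such an assay follow directly from the starting density and the dilution factor. A minimal sketch is given below; the number of dilution steps is a hypothetical parameter, since the text does not state how many dilutions were spotted.

```python
def serial_dilution(start_od: float, fold: float, steps: int) -> list[float]:
    """Cell densities (OD600/mL) of a dilution series, starting with the undiluted culture."""
    return [start_od / fold**i for i in range(steps)]


# Starting density and dilution factor from the text; five steps is an assumption
for od in serial_dilution(start_od=1.0, fold=5, steps=5):
    print(f"{od:g} OD600/mL")
```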

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and applied a manual threshold across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function "Analyze Particles" in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, and used the particle closest to the center. We considered the particle to be a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
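The transformation and filter described above can be sketched as follows. The input layout (pairs of MTX intensity and diploid-plate size) and the function name are illustrative assumptions; only the +1 offset, the log2 transform and the threshold of 16 come from the text.

```python
import math

MIN_DIPLOID_SIZE = 16  # colonies smaller than this on the diploid plate are discarded


def transform_colonies(colonies):
    """Return log2(intensity + 1) for colonies that grew on the diploid selection plate.

    colonies: iterable of (mtx_intensity, diploid_size) pairs.
    """
    return [
        math.log2(mtx_intensity + 1)
        for mtx_intensity, diploid_size in colonies
        if diploid_size >= MIN_DIPLOID_SIZE
    ]


# Hypothetical values: the third colony failed diploid selection and is dropped
print(transform_colonies([(0, 20), (3, 16), (7, 10)]))
```

Adding 1 before the transform keeps colonies with zero intensity in the data set, since log2(0) is undefined.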

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected. These Zs were used to compare the same interaction across linker size combinations. We considered a change significant when Zs differed by more than 2.
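The scoring scheme above can be illustrated with a small, self-contained sketch. The analysis in the text used the R function normalmixEM; the code below is not that implementation but a basic expectation-maximisation fit written in Python for illustration. It assumes that the component with the lower mean is the background distribution, and the data are synthetic.

```python
import math
import random


def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(xs, n_iter=200):
    """Basic EM fit of a two-component normal mixture.

    Returns [(weight, mean, sd), (weight, mean, sd)] sorted by mean.
    """
    xs = sorted(xs)
    half = len(xs) // 2
    comps = []
    # Initialise each component from one half of the sorted data
    for part in (xs[:half], xs[half:]):
        mu = sum(part) / len(part)
        sd = math.sqrt(sum((x - mu) ** 2 for x in part) / len(part)) or 1.0
        comps.append([0.5, mu, sd])
    for _ in range(n_iter):
        # E-step: probability that each point belongs to component 0
        resp = []
        for x in xs:
            p0 = comps[0][0] * normal_pdf(x, comps[0][1], comps[0][2])
            p1 = comps[1][0] * normal_pdf(x, comps[1][1], comps[1][2])
            resp.append(p0 / (p0 + p1) if p0 + p1 > 0 else 0.5)
        # M-step: update weight, mean and sd of each component
        for k, r in enumerate((resp, [1 - v for v in resp])):
            total = sum(r)
            mu = sum(w * x for w, x in zip(r, xs)) / total
            var = sum(w * (x - mu) ** 2 for w, x in zip(r, xs)) / total
            comps[k] = [total / len(xs), mu, max(math.sqrt(var), 1e-6)]
    return sorted((tuple(c) for c in comps), key=lambda c: c[1])


def z_scores(scores, threshold=2.5):
    """Return (Zs, detected) for each score, with Zs = (Is - b) / sdb."""
    _, b, sdb = fit_two_normals(scores)[0]  # assumed background: lower-mean component
    return [((s - b) / sdb, (s - b) / sdb > threshold) for s in scores]


random.seed(1)  # synthetic interaction scores, for illustration only
scores = [random.gauss(5, 0.5) for _ in range(300)] + [random.gauss(10, 1) for _ in range(100)]
detected = sum(flag for _, flag in z_scores(scores))
print(f"{detected} of {len(scores)} synthetic scores exceed Zs > 2.5")
```

Anchoring the z-score on the background component rather than on the whole distribution keeps the threshold independent of how many true interactions a given plate contains.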

For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3 are the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained, and the median colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) gave inconsistent results for all tested interactions, leaving a total of 228 (197 unique) pairs of interacting proteins.
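The extreme-outlier rule used at the start of this filtering (three interquartile ranges beyond the first or third quartile) can be sketched as follows. The quartile convention is not stated in the text, so the "inclusive" method of Python's `statistics.quantiles` is an assumption here.

```python
import statistics


def drop_extreme_outliers(values, k=3.0):
    """Keep values within [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)].

    The quartile method ('inclusive') is an assumption of this sketch;
    other conventions give slightly different bounds on small samples.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]
```

With k = 3 the rule is deliberately permissive: only colonies far outside the bulk of the replicates are discarded.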

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, baits and preys were positioned so that each block of four adjacent strains tested all combinations of linker lengths for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL); the Is for the different linker size combinations could therefore be compared directly. The difference from the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, where the interaction was detected with the reference linker size (2xL-2xL) and showed a significant change for at least one longer linker combination (difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
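The three-way classification can be written down as a short sketch. The text does not say which FDR procedure was applied, so the Benjamini-Hochberg adjustment below is an assumption, and the function names are hypothetical. A pair with a significant p-value but a difference of 1 or less in absolute value is treated here as unchanged, which is one reading of the criteria above.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(detected_ref, detected_longer, diffs, fdr_pvalues):
    """Classify one protein pair.

    detected_ref: detected with the 2xL-2xL reference.
    detected_longer: detected with at least one longer linker combination.
    diffs, fdr_pvalues: one entry per longer combination, relative to 2xL-2xL.
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    changed = any(abs(d) > 1 and p < 0.05 for d, p in zip(diffs, fdr_pvalues))
    return "quantitative change" if changed else "unchanged"
```

The adjustment would be applied once to the p-values of all paired t-tests, and the adjusted values then passed to `classify_pair` pair by pair.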

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core, and of 5CZ4 for the inner rings of the core. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes lying within a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
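The surface-path distance can be illustrated with a small self-contained sketch. The study used the NetworkX implementation; the version below relies only on the Python standard library so that it runs anywhere, and the toy coordinates and cutoff are invented for the example. The idea is the same: nodes are surface Cα coordinates, edges join nodes closer than a cutoff, edge weights are Euclidean distances, and the distance between two C-termini is the length of the shortest weighted path.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of points separated by at most `cutoff`."""
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf
```

Because the path is constrained to surface nodes, the resulting distance follows the outside of the complex rather than cutting through its interior, which is why it can exceed the straight-line distance between two C-termini.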

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, the effect is minor and not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, each fused to the DHFR F[1,2] fragment with the 2xL, 3xL and 4xL. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). The preys include proteins known to interact with the baits, proteins within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions tested in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
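The size of this screen follows directly from the design, and the arithmetic can be checked in a few lines. All numbers below are the ones quoted in the paragraph above; only the variable names are introduced for the illustration.

```python
# Design of the global screen, using the counts quoted in the text.
n_proteins, n_linkers, n_preys = 15, 3, 592

n_baits = n_proteins * n_linkers   # each protein carries 2xL, 3xL and 4xL
n_tested = n_baits * n_preys       # every bait is queried against every prey

# Detected PPIs per bait linker (z-score above the detection threshold).
detected = {"2xL": 99, "3xL": 110, "4xL": 126}
gain_vs_2xl = {linker: n - detected["2xL"] for linker, n in detected.items()}
```

The 4xL baits therefore recover 27 more interactions than the 2xL baits out of the same set of tested pairs.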

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI).
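Counting reciprocal bait-prey orientations as a single PPI amounts to treating each pair as unordered. A minimal sketch, with example pairs that are purely illustrative and not taken from the result tables:

```python
def unique_pairs(bait_prey_pairs):
    """Collapse (bait, prey) tuples so that X-Y and Y-X count once."""
    return {frozenset(pair) for pair in bait_prey_pairs}


# Illustrative input: the first two entries are reciprocal orientations.
example = [("Rpb2", "Rpb3"), ("Rpb3", "Rpb2"), ("Rpb5", "Rpo26")]
n_pairs = len(example)                 # 3 oriented pairs
n_unique = len(unique_pairs(example))  # 2 unique pairs
```

This is the relationship between the two counts reported throughout the section, for example 228 pairs versus 197 unique pairs.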

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.
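The counts in this paragraph are mutually consistent, which a short check makes explicit. The numbers are those reported above; the variable names are only for the illustration.

```python
# Counts reported in the text: (all pairs, unique pairs).
total = (228, 197)
unchanged = (135, 114)
quantitative = (19, 18)
new = (74, 65)

# Pairs affected by the longer linker: quantitative changes plus new PPIs.
affected = tuple(q + n for q, n in zip(quantitative, new))

# Share of pairs with no significant change, in percent.
unchanged_pct = tuple(round(100 * u / t, 1) for u, t in zip(unchanged, total))
```

The affected pairs sum to 93 (83 unique), and unchanged plus affected pairs recover the totals of 228 and 197, with 59.2% and 57.9% of pairs unchanged.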

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1, from the core subunit, and proteins from lobe a or lobe b (Fig. 1D, right panel). Together, these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases the signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that longer linkers enhance the ability to detect interactions, especially for proteins that are more distant in space.
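The distance-versus-signal relationship is summarized with a Spearman rank correlation (Fig. 2B). As a reminder of what that statistic computes, here is a minimal standard-library sketch: values are replaced by their ranks, with ties sharing the average rank, and the Pearson correlation of the ranks is returned. It does not compute a p-value and is not the code used for the figure.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A negative value, as reported for the proteasome, means that z-scores tend to decrease as the distance between C-termini increases, without assuming that the relationship is linear.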

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information on subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institute of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canadian Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering every pair of reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request

Table S1B PCA data for global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request

Table S1C PCA data for intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request

Table S1D PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request


Table S2A Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request


Table S2B. Identity between each RNApol structure and the experimental sequences

Reference        Yeast protein   Complex      Identity (%)
4C2M chain 1     Rpc10           RNApol I     100
4C2M chain 2     Rpa34           RNApol I     92.4
4C2M chain 3     Rpa49           RNApol I     94.4
4C2M chain 4     Rpa43           RNApol I     100
4C2M chain 5     Rpa190          RNApol I     89.7
4C2M chain 6     Rpc40           RNApol I     100
4C2M chain 7     Rpa135          RNApol I     97.2
4C2M chain 8     Rpb5            RNApol I     100
4C2M chain 9     Rpa14           RNApol I     59.6
4C2M chain 10    Rpa43           RNApol I     81.4
4C2M chain 11    Rpo26           RNApol I     100
4C2M chain 12    Rpa12           RNApol I     100
4C2M chain 13    Rpb8            RNApol I     88.2
4C2M chain 14    Rpc19           RNApol I     100
4C2M chain 15    Rpb10           RNApol I     100
4C2M chain 16    Rpa49           RNApol I     100
4C2M chain 17    Rpc10           RNApol I     100
4C2M chain 18    Rpa43           RNApol I     100
4C2M chain 19    Rpa34           RNApol I     92.4
4C2M chain 20    Rpa135          RNApol I     96.2
4C2M chain 21    Rpa190          RNApol I     88.5
4C2M chain 22    Rpa14           RNApol I     55.1
4C2M chain 23    Rpc40           RNApol I     100
4C2M chain 24    Rpo26           RNApol I     100
4C2M chain 25    Rpb5            RNApol I     100
4C2M chain 26    Rpb8            RNApol I     88.2
4C2M chain 27    Rpa43           RNApol I     80.2
4C2M chain 28    Rpb10           RNApol I     100
4C2M chain 29    Rpa12           RNApol I     96
4C2M chain 30    Rpc19           RNApol I     100
4C3I chain A     Rpa190          RNApol I     89.2
4C3I chain C     Rpc40           RNApol I     99.3
4C3I chain B     Rpa135          RNApol I     98.2
4C3I chain E     Rpb5            RNApol I     100
4C3I chain D     Rpa14           RNApol I     55.1
4C3I chain G     Rpa43           RNApol I     78.3
4C3I chain F     Rpo26           RNApol I     100
4C3I chain I     Rpa12           RNApol I     100
4C3I chain H     Rpb8            RNApol I     84.7
4C3I chain K     Rpc19           RNApol I     100
4C3I chain J     Rpb10           RNApol I     100
4C3I chain M     Rpa49           RNApol I     97.2
4C3I chain L     Rpc10           RNApol I     100
4C3I chain N     Rpa34           RNApol I     88
4V1N chain A     Rpo21           RNApol II    97.9
4V1N chain C     Rpb3            RNApol II    100
4V1N chain B     Rpb2            RNApol II    93.6
4V1N chain E     Rpb5            RNApol II    100
4V1N chain D     Rpb4            RNApol II    80.8
4V1N chain G     Rpb7            RNApol II    100
4V1N chain F     Rpo26           RNApol II    100
4V1N chain I     Rpb9            RNApol II    100
4V1N chain H     Rpb8            RNApol II    91
4V1N chain K     Rpb11           RNApol II    100
4V1N chain J     Rpb10           RNApol II    100
4V1N chain L     Rpc10           RNApol II    100
4V1N chain R     Tfg2            RNApol II    60.3
5FJA chain A     Rpo31           RNApol III   96.2
5FJA chain C     Rpc40           RNApol III   100
5FJA chain B     Ret1            RNApol III   100
5FJA chain E     Rpb5            RNApol III   100
5FJA chain D     Rpc17           RNApol III   73.9
5FJA chain G     Rpc25           RNApol III   85.8
5FJA chain F     Rpo26           RNApol III   100
5FJA chain I     Rpc11           RNApol III   82.7
5FJA chain H     Rpb8            RNApol III   94.5
5FJA chain K     Rpc19           RNApol III   100
5FJA chain J     Rpb10           RNApol III   100
5FJA chain M     Rpc37           RNApol III   84.9
5FJA chain L     Rpc10           RNApol III   100
5FJA chain O     Rpc82           RNApol III   84.3
5FJA chain N     Rpc53           RNApol III   73.8
5FJA chain Q     Rpc31           RNApol III   100
5FJA chain P     Rpc34           RNApol III   57.2


Table S2C. Identity between the proteasome structure and the experimental sequences

Reference                        Yeast protein   Complex      Identity (%)
5CZ4-centered chain A            Pre8            Proteasome   100
5CZ4-centered chain AA           Pre4            Proteasome   100
5CZ4-centered chain B            Pre9            Proteasome   100
5CZ4-centered chain BA           Pre3            Proteasome   100
5CZ4-centered chain C            Pre6            Proteasome   100
5CZ4-centered chain D            Pup2            Proteasome   97.1
5CZ4-centered chain E            Pre5            Proteasome   100
5CZ4-centered chain F            Pre10           Proteasome   100
5CZ4-centered chain G            Scl1            Proteasome   100
5CZ4-centered chain H            Pup1            Proteasome   100
5CZ4-centered chain I            Pup3            Proteasome   100
5CZ4-centered chain J            Pre1            Proteasome   100
5CZ4-centered chain K            Pre2            Proteasome   100
5CZ4-centered chain L            Pre7            Proteasome   100
5CZ4-centered chain M            Pre4            Proteasome   100
5CZ4-centered chain N            Pre3            Proteasome   100
5CZ4-centered chain O            Pre8            Proteasome   100
5CZ4-centered chain P            Pre9            Proteasome   100
5CZ4-centered chain Q            Pre6            Proteasome   100
5CZ4-centered chain R            Pup2            Proteasome   97.1
5CZ4-centered chain S            Pre5            Proteasome   100
5CZ4-centered chain T            Pre10           Proteasome   100
5CZ4-centered chain U            Scl1            Proteasome   100
5CZ4-centered chain V            Pup1            Proteasome   100
5CZ4-centered chain W            Pup3            Proteasome   100
5CZ4-centered chain X            Pre1            Proteasome   100
5CZ4-centered chain Y            Pre2            Proteasome   100
5CZ4-centered chain Z            Pre7            Proteasome   100
5A5B-centered chain A            Pre3            Proteasome   100
5A5B-centered chain AA           Rpn7            Proteasome   100
5A5B-centered chain B            Pup1            Proteasome   100
5A5B-centered chain BA           Rpn3            Proteasome   100
5A5B-centered chain C            Pup3            Proteasome   100
5A5B-centered chain CA           Rpn12           Proteasome   100
5A5B-centered chain D            Pre1            Proteasome   100
5A5B-centered chain DA           Rpn8            Proteasome   82.9
5A5B-centered chain E            Pre2            Proteasome   99.5
5A5B-centered chain EA           Rpn11           Proteasome   89.5
5A5B-centered chain F            Pre7            Proteasome   100
5A5B-centered chain FA           Rpn10           Proteasome   100
5A5B-centered chain G            Pre4            Proteasome   100
5A5B-centered chain GA           Rpn13           Proteasome   100
5A5B-centered chain HA           Sem1            Proteasome   100
5A5B-centered chain IA           Rpn1            Proteasome   85.9
5A5B-centered chain J            Scl1            Proteasome   100
5A5B-centered chain K            Pre8            Proteasome   100
5A5B-centered chain L            Pre9            Proteasome   100
5A5B-centered chain M            Pre6            Proteasome   100
5A5B-centered chain N            Pup2            Proteasome   100
5A5B-centered chain O            Pre5            Proteasome   100
5A5B-centered chain P            Pre10           Proteasome   100
5A5B-centered chain Q            Rpt1            Proteasome   88
5A5B-centered chain R            Rpt2            Proteasome   100
5A5B-centered chain S            Rpt6            Proteasome   100
5A5B-centered chain T            Rpt3            Proteasome   100
5A5B-centered chain U            Rpt4            Proteasome   100
5A5B-centered chain V            Rpt5            Proteasome   93.1
5A5B-centered chain W            Rpn2            Proteasome   90.9
5A5B-centered chain X            Rpn9            Proteasome   100
5A5B-centered chain Y            Rpn5            Proteasome   100
5A5B-centered chain Z            Rpn6            Proteasome   100
Constructed proteasome chain 1   Pup1            Proteasome   100
Constructed proteasome chain 10  Pre8            Proteasome   100
Constructed proteasome chain 11  Pre9            Proteasome   100
Constructed proteasome chain 12  Pre6            Proteasome   100
Constructed proteasome chain 13  Pup2            Proteasome   100
Constructed proteasome chain 14  Pre5            Proteasome   100
Constructed proteasome chain 15  Pre10           Proteasome   100
Constructed proteasome chain 16  Rpt1            Proteasome   88
Constructed proteasome chain 17  Rpt2            Proteasome   100
Constructed proteasome chain 18  Rpt6            Proteasome   100
Constructed proteasome chain 19  Rpt3            Proteasome   100
Constructed proteasome chain 2   Pup3            Proteasome   100
Constructed proteasome chain 20  Rpt4            Proteasome   100
Constructed proteasome chain 21  Rpt5            Proteasome   93.1
Constructed proteasome chain 22  Rpn2            Proteasome   90.9
Constructed proteasome chain 23  Rpn9            Proteasome   100
Constructed proteasome chain 24  Rpn5            Proteasome   100
Constructed proteasome chain 25  Rpn6            Proteasome   100
Constructed proteasome chain 26  Rpn7            Proteasome   100
Constructed proteasome chain 27  Rpn3            Proteasome   100
Constructed proteasome chain 28  Rpn12           Proteasome   100
Constructed proteasome chain 29  Rpn8            Proteasome   82.9
Constructed proteasome chain 3   Pre1            Proteasome   100
Constructed proteasome chain 30  Rpn11           Proteasome   89.5
Constructed proteasome chain 31  Rpn10           Proteasome   100
Constructed proteasome chain 32  Rpn13           Proteasome   100
Constructed proteasome chain 33  Sem1            Proteasome   100
Constructed proteasome chain 34  Rpn1            Proteasome   85.9
Constructed proteasome chain 35  Pup1            Proteasome   100
Constructed proteasome chain 36  Pup3            Proteasome   100
Constructed proteasome chain 37  Pre1            Proteasome   100
Constructed proteasome chain 38  Pre2            Proteasome   100
Constructed proteasome chain 39  Pre7            Proteasome   100
Constructed proteasome chain 4   Pre2            Proteasome   100
Constructed proteasome chain 40  Pre4            Proteasome   100
Constructed proteasome chain 41  Pre3            Proteasome   100
Constructed proteasome chain 42  Pre4            Proteasome   100
Constructed proteasome chain 45  Scl1            Proteasome   100
Constructed proteasome chain 46  Pre8            Proteasome   100
Constructed proteasome chain 47  Pre9            Proteasome   100
Constructed proteasome chain 48  Pre6            Proteasome   100
Constructed proteasome chain 49  Pup2            Proteasome   100
Constructed proteasome chain 5   Pre7            Proteasome   100
Constructed proteasome chain 50  Pre5            Proteasome   100
Constructed proteasome chain 51  Pre10           Proteasome   100
Constructed proteasome chain 52  Rpt1            Proteasome   88
Constructed proteasome chain 53  Rpt2            Proteasome   100
Constructed proteasome chain 54  Rpt6            Proteasome   100
Constructed proteasome chain 55  Rpt3            Proteasome   100
Constructed proteasome chain 56  Rpt4            Proteasome   100
Constructed proteasome chain 57  Rpt5            Proteasome   93.1
Constructed proteasome chain 58  Rpn2            Proteasome   90.9
Constructed proteasome chain 59  Rpn9            Proteasome   100
Constructed proteasome chain 6   Pre3            Proteasome   100
Constructed proteasome chain 60  Rpn5            Proteasome   100
Constructed proteasome chain 61  Rpn6            Proteasome   100
Constructed proteasome chain 62  Rpn7            Proteasome   100
Constructed proteasome chain 63  Rpn3            Proteasome   100
Constructed proteasome chain 64  Rpn12           Proteasome   100
Constructed proteasome chain 65  Rpn8            Proteasome   82.9
Constructed proteasome chain 66  Rpn11           Proteasome   89.5
Constructed proteasome chain 67  Rpn10           Proteasome   100
Constructed proteasome chain 68  Rpn13           Proteasome   100
Constructed proteasome chain 69  Sem1            Proteasome   100
Constructed proteasome chain 70  Rpn1            Proteasome   85.9
Constructed proteasome chain 9   Scl1            Proteasome   100


Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

Yeast protein   Complex   Reference   Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23

Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7

Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed between PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome, for all combinations of linker lengths, according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right, owing to the ability of the 4xL linker to detect interactions between proteins that are farther apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiments. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
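The thresholding in panel (B) and the reciprocal-pair comparison in panel (C) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the caption does not say how z-scores were computed, so standardizing against the mean and population standard deviation of all colonies is an assumption, the default cutoff of 2.5 is this sketch's reading of the threshold quoted in the caption, and the function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values against their own mean and population SD (assumed scheme)."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("all colony sizes are identical; z-scores are undefined")
    return [(v - mu) / sd for v in values]


def call_positives(colony_sizes, threshold=2.5):
    """Return the names of colonies whose z-score exceeds the threshold.

    colony_sizes maps an interaction name to its colony size.
    The default threshold is an assumption, not a value confirmed by the source.
    """
    names = list(colony_sizes)
    zs = z_scores([colony_sizes[n] for n in names])
    return {n for n, z in zip(names, zs) if z > threshold}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def reciprocal_correlation(signal):
    """Correlate the signal of each pair (a, b) with its reciprocal (b, a).

    signal maps (bait, prey) tuples to a colony size; pairs without a
    measured reciprocal are ignored.
    """
    xs, ys = [], []
    for (a, b), s in signal.items():
        if a < b and (b, a) in signal:
            xs.append(s)
            ys.append(signal[(b, a)])
    return pearson_r(xs, ys)
```

A call such as `call_positives({"bait-prey": 1000.0, ...})` returns the set of interaction names above the cutoff, and `reciprocal_correlation` gives one coefficient per linker combination when it is run on each combination's signal table separately.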

Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the cores of the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
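The distance-weighted shortest path of panel (C) can be illustrated with Dijkstra's algorithm over surface residues. The sketch below is not the authors' implementation: it assumes that each surface residue is reduced to one 3D coordinate in angstroms, that two residues are connected when they lie within a hypothetical `max_step` cutoff, and that each edge is weighted by its Euclidean length. The path between two C-terminal residues then follows the surface of the complex instead of cutting through it.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, max_step=8.0):
    """Length of the distance-weighted shortest path between two residues.

    coords   : dict mapping a residue name to its (x, y, z) position
    start    : name of the first C-terminal residue
    end      : name of the second C-terminal residue
    max_step : assumed cutoff; residues farther apart than this are not linked
    Returns math.inf when no chain of surface residues connects the two ends.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        if u == end:
            return d
        done.add(u)
        for v, pos in coords.items():
            if v in done:
                continue
            step = math.dist(coords[u], pos)
            if step > max_step:
                continue  # too far apart to be neighbours on the surface
            candidate = d + step
            if candidate < best.get(v, math.inf):
                best[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return math.inf
```

The neighbour search here is quadratic in the number of residues, which keeps the sketch short; a spatial index would be the usual choice for a structure the size of the proteasome.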

General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close to each other in space, without these associations necessarily being physical interactions. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, we first had to verify whether increasing linker length changed the set of interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thus improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths that are used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment seems sufficient to detect most of the new PPIs and most of those whose signal increases. The rare cases in which the signal decreased as linker length increased are more likely caused by steric effects than by a destabilization of the proteins involved. These cases can nevertheless provide structural information, in particular by identifying the strongest associations within the complex. The use of extended linkers also informs on the organization of protein complexes, particularly when central proteins are involved. Finally, the associations detected reflect well the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results, it is possible to confirm that increasing linker length does allow the detection of associations between proteins that are farther apart in space.
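The bookkeeping behind the share of new interactions, and behind the associations seen with only one linker combination, can be sketched as simple set operations. This is a minimal illustration and not the analysis pipeline used in the thesis: the function names, the dictionary layout and the idea of passing previously reported interactions as a set are all assumptions made for the example.

```python
def pair(a, b):
    """Unordered protein pair, so that A-B and B-A count as one interaction."""
    return frozenset((a, b))


def summarize_linker_screen(detected_by_combo, previously_reported):
    """Summarize a screen run with several linker combinations.

    detected_by_combo   : dict mapping a combination label (for example
                          "2xL-2xL" or "4xL-4xL") to the set of pairs detected
    previously_reported : set of pairs already known before the screen
    """
    unique = set().union(*detected_by_combo.values())
    new = unique - previously_reported
    exclusive = {}
    for combo, pairs in detected_by_combo.items():
        # Pairs seen with any other combination are not exclusive to this one.
        others = set().union(
            *(p for c, p in detected_by_combo.items() if c != combo)
        )
        exclusive[combo] = pairs - others
    return {
        "n_unique": len(unique),
        "n_new": len(new),
        "fraction_new": len(new) / len(unique) if unique else 0.0,
        "exclusive": exclusive,
    }
```

Dropping one combination from `detected_by_combo` and recomputing shows directly how the fraction of new interactions depends on the number of combinations used, which is the point made in the paragraph above.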

The modification made to DHFR PCA is a real advance for the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside the linkers of additional lengths. This would provide a validation and a point of comparison, and would help detect problems that may have occurred during strain construction. For example, it becomes easier to spot a faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA gives access to an additional hybrid method that remains relatively simple. It does not require any particular infrastructure, but it can also be applied at large scale with a robotic platform. Moreover, DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close to each other, which reduces false positives. DHFR PCA can be performed on solid or in liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the dynamics of interactions (56). These features provide certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would give several resolutions of the PPI network. The first step would be to determine the maximal length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also have to be determined, so as to maximize the new information gained while taking into account the complexity added with each additional linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, could be sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or with other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution made possible by extending the linker would give us a general picture of their architecture. At the level of the proteome of an organism, this method would provide a better definition of the organization of the cellular machinery.

Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.

22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.

43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.

Page 3: Mesurer les associations protéiques à proximité in …...Mesurer les associations protéiques à proximité in vivo en utilisant la complémentation de fragments protéiques Mémoire

III

Reacutesumeacute

Les interactions proteacuteine-proteacuteine (PPI) sont agrave la base du fonctionnement cellulaire de tous

les organismes Regroupeacutees en deux cateacutegories les meacutethodes pour eacutetudier les PPI permettent

soit drsquoidentifier les proteacuteines composant le complexe soit de deacuteterminer les relations entre

les proteacuteines Il existe peu de meacutethodes hybrides permettant drsquoobtenir ces deux informations

et ces meacutethodes comportent plusieurs limitations Le but de ce projet eacutetait de deacutevelopper une

nouvelle meacutethode hybride en modifiant la compleacutementation de fragments proteacuteiques (DHFR

PCA) chez la levure Saccharomyces cerevisiae Le principe de la DHFR PCA repose sur

lrsquoassociation de deux fragments rapporteurs compleacutementaires en preacutesence drsquoune interaction

proteacuteine-proteacuteine Les fragments rapporteurs sont fusionneacutes aux proteacuteines via un connecteur

peptidique La longueur du connecteur limite la distance maximale agrave laquelle il est possible

de deacutetecter une interaction entre deux proteacuteines Notre hypothegravese eacutetait qursquoen augmentant la

longueur du connecteur nous serions en mesure de deacutetecter des interactions plus eacuteloigneacutees

Nous avons drsquoabord veacuterifieacute que lrsquoaugmentation de la longueur du connecteur permettait de

modifier notre capaciteacute agrave deacutetecter des interactions sans toutefois perdre la speacutecificiteacute de la

meacutethode De nouvelles interactions ont eacuteteacute deacutetecteacutees agrave lrsquointeacuterieur drsquoun mecircme complexe

proteacuteique et entre deux complexes Nous avons ensuite valideacute notre capaciteacute agrave mieux

disseacutequer lrsquoarchitecture des complexes proteacuteiques en approfondissant le cas de cinq

complexes proteacuteiques agrave lrsquoaide de plusieurs combinaisons de longueurs de connecteurs Enfin

nous avons confirmeacute que la meacutethode permettait effectivement de deacutetecter des interactions

entre proteacuteines plus distantes en comparant les reacutesultats obtenus aux distances calculeacutees agrave

partir des structures du proteacuteasome disponibles La variation apporteacutee agrave la DHFR PCA

permet de moduler la reacutesolution de lrsquoeacutetude des PPI et ainsi de mieux deacutefinir lrsquoarchitecture

des complexes proteacuteiques

IV

Abstract

Protein-protein interactions (PPIs) are central to all cellular processes in all organisms. Methods to study PPIs fall into two categories: they either identify the proteins that compose protein complexes or determine the relationships between proteins. Only a few hybrid methods provide both kinds of information, and these methods have many limitations. The goal of this project was to develop a new hybrid method by modifying the protein-fragment complementation assay (DHFR PCA) in the yeast Saccharomyces cerevisiae. DHFR PCA is based on the association of two complementary reporter fragments in the presence of an interaction. Both fragments are fused to proteins through a peptide linker. Linker length limits the maximal distance at which an interaction between two proteins can be detected. Our hypothesis was that increased linker length would allow the detection of more distant interactions. We first verified that increasing linker length modified our capacity to detect interactions without losing specificity. New interactions were detected within and between complexes. We then validated our capacity to better dissect the architecture of protein complexes by studying five protein complexes with different combinations of linker lengths. Finally, we confirmed that the method allowed the detection of interactions between proteins that are farther apart in space by comparing our results with distances calculated from available proteasome structures. This variation of DHFR PCA makes it possible to modulate the resolution of PPI studies and thus to better define the architecture of protein complexes.


Table of contents

Résumé
Abstract
Table of contents
List of tables
List of figures
List of abbreviations
Acknowledgements
Foreword
General introduction
1.1 The fundamental nature of protein-protein interactions
1.2 Practical applications of the study of protein-protein interactions
1.3 Categories of methods for studying protein-protein interactions
1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry
1.3.2 Methods determining the protein interaction network
1.4 Current challenge in the study of protein-protein interactions
1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions
1.6 Research objectives
Measuring proximate protein association in living cells using Protein-fragment complementation assay (PCA)
Résumé
Abstract
Introduction
Material and Methods
Yeast
Bacteria
Plasmid construction
Strain construction
Estimation of protein abundance
Protein-fragment complementation assays


PCA images and statistical analyses
Analysis of protein distances within complexes
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
PCA signal reflects the super-organization of protein complexes
Longer linkers allow detection of more distant proteins in complexes
Conclusion
Acknowledgements
General conclusion
Bibliography


List of tables

Table S1A. Description of the strains constructed and used for this study
Table S1B. PCA data for global PCA experiment
Table S1C. PCA data for intra-complexes experiment
Table S1D. PCR primers used in this study
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2B. Identity between each RNApol structure and the experimental sequences
Table S2C. Identity between proteasome structure and the experimental sequence
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures


List of figures

Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale Protein-fragment complementation (PCA) screen and prove to be useful to infer the super-organization of protein complexes
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes
Figure S1. Data related to the PCA experiments
Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins


List of abbreviations

% Percentage
°C Degree Celsius
Å Ångström
DNA Deoxyribonucleic acid
Amp Ampicillin
mRNA Messenger ribonucleic acid
BioID Proximity-dependent biotinylation
ClonNAT Nourseothricin
COG Conserved oligomeric Golgi
DHFR Dihydrofolate reductase
DMSO Dimethyl sulfoxide
F[1,2] Fragment 1,2 of DHFR
F[3] Fragment 3 of DHFR
FDR Corrected P-value
FRET Energy transfer between fluorescent molecules
g Gram
Gly or G Glycine
h Hour
HygB Hygromycin B
Is Interaction score
L Liter
Log Logarithm
M Molar
Min Minute
mL Milliliter
mM Millimolar
MS Mass spectrometry
MS/MS Tandem mass spectrometry
MTX Methotrexate
MYTH Membrane yeast two-hybrid


NaCl Sodium chloride
NMR Nuclear magnetic resonance
OD Optical density
PBS Phosphate-buffered saline
PCA Protein-fragment complementation assay
PCR Polymerase chain reaction
PKA Protein kinase A
PPI Protein-protein interaction
Q1 First quartile
Q3 Third quartile
r Correlation coefficient
RNApol RNA polymerase
Sdb Standard deviation
Ser or S Serine
SDS Sodium dodecyl sulfate
SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test Student's t-test
YPD Yeast extract peptone dextrose
Y2H Yeast two-hybrid
Zs Z-score
μb Estimated mean
μg Microgram
μL Microliter
μM Micromolar
2YT 2x yeast extract tryptone
2xL Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif


Acknowledgements

Completing this project required the help of many people, whom I sincerely wish to thank. First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey, Christian encouraged me to give the best of myself, both scientifically and as a member of the group. He not only gave me the material means to do so, but also showed me that I had the ability to do it. Christian is a supervisor who is very present and available for his students. He offered me opportunities and supported me in each of them.

I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr. Nicolas Bisson, for their advice and the time they devoted to this project.

I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the two research professionals of the laboratory. Their great expertise and their passion for science are a pillar of this team. Without their valuable advice, dedication and availability, carrying out this project would have been particularly arduous. I also wish to thank my collaborators, Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is all the better. Special thanks to Xavier for his help, his availability and the stimulating discussions.

I believe it is important to thank all the members of the Landry laboratory. Graduate studies require spending a lot of time in the laboratory, which becomes like a second home. Hence the importance of sharing laughs and building a rapport with its members. I would like to thank them all for the chats and laughs at the famous "tea breaks", the lively discussions and, of course, the support, both in the laboratory and morally. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. A very special thank-you to Guillaume and Hélène, who were particularly good at putting a smile on my face or at supporting and advising me in difficult times.


It is also important to thank my parents, as well as all my family and friends. My parents have always encouraged me to fulfil myself and to love my work. They not only provided me with an ideal setting in which to reach my goals throughout my studies, but also offered me their moral support and instilled in me the importance of always doing one's best. The values they passed on to me have given me a strong sense of responsibility, honesty and commitment. Thanks to my family and friends, I was able to unwind, simply have fun and pour my heart out from time to time. They were a moral support.

Finally, I wish to thank from the bottom of my heart my partner, Marc Bélanger. Marc is an incredibly generous person: generous with his time, his listening, his knowledge and his passions. He was an invaluable support throughout this journey, at every moment. His encouragement, his shoulder, his tissues and his understanding eased my fears and my sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has given me personally, humanly and professionally. Marc has made me a better person, and I will always be grateful to him. Thank you, my love, thank you for everything.


Foreword

This thesis comprises a single chapter written in the form of a scientific article that will be submitted for publication. This article presents the adaptation of the PCA method to detect associations between proteins that are distant in space and its application to the study of protein complexes. I contributed to the planning of the experiments with Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research professionals). Several people, myself included, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were carried out jointly by Isabelle Gagnon-Arsenault, Christian Landry and myself.

During this project, I also contributed to the writing of a literature review published in Briefings in Functional Genomics in March 2016 under the title "Multi-scale perturbations of protein interactomes reveals their mechanisms of regulation, robustness and insights into genotype-phenotype maps". Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry. That article is not presented in this thesis.


General introduction

1.1 The fundamental nature of protein-protein interactions

Proteins, through their great diversity of roles, are regarded as the machinery of life. Their associations, whether temporary or permanent, are at the heart of signalling and regulatory pathways as well as of protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since they are involved in all cellular processes as well as in the maintenance of cellular functions.

Interactions that form transiently are often found in signalling and regulatory processes. They require excellent spatiotemporal coordination, which explains why poor coordination can lead to diseases such as cancer (1). One example of a transient association is that of the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, ageing and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).

In addition to transient interactions, the cell hosts stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs within a complex are stable, the protein complex may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins allows the emergence of additional biological activities that would be impossible for the individual proteins. An example that illustrates this concept very well is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, is composed of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).

The two preceding examples clearly demonstrate the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning and cellular viability of a given organism. The PPI network has been mapped on a large scale for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses (27-29). These maps represent a static picture of the network and do not fully take into account the cell's capacity to adapt to different conditions (e.g. environment, cell cycle). To overcome this limitation, additional maps were subsequently produced that consider the dynamics of interaction networks, that is, by perturbing cell growth conditions. Among other things, they provide information on the adaptation or plasticity of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.

1.2 Practical applications of the study of protein-protein interactions

The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore of yeast and of the trypanosome (30). These two organisms, which diverged more than 1.5 billion years ago, show similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the membrane of the cell nucleus and controls the transport of molecules between the nucleus and the cytoplasm. Obado and collaborators thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The differences in structure explain the distinct mechanisms of mRNA export in the two organisms (30). In addition, perturbing PPIs makes it possible to elucidate the robustness of a protein complex to mutations, that is, the ability of the complex to function despite the perturbation. Diss and collaborators systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analysing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of some proteins, whereas the retromer dissociated completely after the loss of a single protein. They thereby identified the proteins essential for the assembly of these complexes and demonstrated the importance of paralogs for robustness (31).

In the medical field, the study of PPIs has been widely used to discover new drugs (32-34). In addition, identifying the structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which should eventually make it possible to treat infections caused by these parasites (35). PPIs also help in understanding the genetic basis of diseases, as shown by Sahni and collaborators. This team studied nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, the perturbation of interaction networks was responsible for the diseases under study, by affecting the networks either partially or completely. Moreover, different mutations in the same gene lead to different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, since each has advantages and limitations that allow it to target only a particular subset of the interaction network (37). Nevertheless, all of these methods can be divided into two main categories: methods that determine the composition of protein complexes and methods that determine the physical interactions between two proteins.

The first category includes methods that purify a protein complex, by either affinity or separation chromatography, and then analyse it by mass spectrometry (MS). The second category comprises a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the second category share a very similar principle, based on the reconstitution of a functional reporter that emits a signal when the two proteins physically interact. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, the term "hybrid method" refers to methods that detect associations between proteins that are close in space without these necessarily being physical interactions. These methods therefore combine the characteristics of both categories. For this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary, because they define, on the one hand, the components of a protein complex and, on the other hand, the relationships among them.

1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purification of protein complexes followed by identification of their components by MS aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract by means of an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the non-specific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, which yields a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins found in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first MS, a peptide is selected and fragmented, and a new spectrum is acquired from the resulting fragments. This additional spectrum provides more information on that peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main value of this purification is that it allows all the protein complexes of an organism to be isolated for study (43).
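
As an illustration of the mass-to-charge ratio on which the separation described above is based, the following Python sketch computes the m/z value expected for a positively charged peptide ion. It assumes that ionization occurs by protonation; the function name `mass_to_charge` and the example peptide mass are hypothetical and are not taken from this work.

```python
# Mass of a proton in daltons (standard physical constant).
PROTON_MASS_DA = 1.007276


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """Return m/z for a peptide that carries `charge` additional protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # Each added proton contributes both one unit of charge and its own mass.
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


if __name__ == "__main__":
    peptide_mass = 1000.0  # hypothetical neutral peptide mass, in Da
    for z in (1, 2, 3):
        print(f"z = {z}: m/z = {mass_to_charge(peptide_mass, z):.4f}")
```

The same peptide therefore appears at different positions in the spectrum depending on its charge state, which is one reason why spectra are matched against a database rather than read directly.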

1.3.2 Methods determining the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment complementation

Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments attached to the two proteins of interest through a linker. When the two proteins of interest physically interact, the two reporter fragments assemble, reconstituting a functional reporter that produces a detectable signal. In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that enabled the development of several others. However, this technique has some limitations. First, in classical Y2H, the proteins under study must be soluble. Variations of the method have nevertheless been introduced to allow the study of membrane proteins (45-47); one such method is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, suffers from background noise and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as humans, to be studied in a simpler model (51).

In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest physically interact, the transcription factor attached to the reconstituted ubiquitin is released, thereby activating transcription of a reporter gene. Methods based on split-ubiquitin have enabled major advances in the study of proteins that are membrane-bound, insoluble or located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the impossibility of quantifying the results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of a transcription factor it uses as a reporter a protein that has been split into two fragments. The choice of the reporter and of the cleavage site were decisive elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to interact spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as a reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved in particular in the synthesis of certain DNA bases (purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative, in addition to preserving the natural promoter of the proteins under study (48, 55, 56). Furthermore, results obtained by PCA suggest that the cellular localization of proteins is preserved. Indeed, there is a gene ontology enrichment for several known proteins that share the same cellular localization (55). However, a change in localization cannot be ruled out, since the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. Adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. It may be necessary to optimize the linker's composition or length. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are especially useful when a flexible linker fails to separate the two elements adequately or interferes with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on energy transfer between two fluorescent proteins that are close to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Excitation of the donor fluorescent protein excites the acceptor fluorescent protein when the two proteins are close to each other. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, high background and the partial overlap of the fluorescence of the two proteins can hamper the interpretation of the results (60-63).

Cross-linking followed by MS is practically identical to purification and MS techniques, except that the proteins are covalently linked to one another before purification. These links withstand enzymatic digestion and thus provide structural information on how proteins associate within the protein complex. Nevertheless, cross-linking complicates data analysis and can potentially lead to a misrepresentation of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase that lacks specificity and is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to interactions between neighbouring proteins. However, the biotin ligase is larger than the green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can impair the activity of the protein of interest or the formation of interactions. In addition, this method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They report on the proximity of proteins, giving access to a scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would help to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. These new hybrid methods would compensate for the shortcomings of high-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to many protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and of the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good association of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small amino acids, the former nonpolar and the latter polar. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also constrains the ability to detect an interaction, as observed notably by the research team that developed large-scale PCA (55). While studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing the linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to obtain new structural information.

1.6 Research objectives

These previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length of the DHFR PCA would allow the detection of interactions increasingly distant in space, which would modulate the scale of molecular resolution. This adaptation would then yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was performed with different combinations of linker lengths. The final objective was to determine whether increasing linker length allowed the detection of associations between proteins that are farther apart in space. To do so, distances were calculated between the proteins contained in the proteasome structures and compared with the experimental results.

This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive in several respects, notably the availability of numerous powerful genetic tools, its rapid cell division and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the dihydrofolate reductase-based protein-fragment complementation assay (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here we examine the potential of the yeast Protein-fragment Complementation Assay based on the dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex is dependent on stable association among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods, because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86).

The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future development in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was shown that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter made it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
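
The distance constraint described above can be illustrated with a minimal Python sketch. It assumes that the coordinates of two C-terminal atoms (in Å) are already available from a structure file; the function names and the example coordinates are hypothetical, and the 82 Å cutoff is the value reported in (55) for the standard linker.

```python
import math

# Cutoff reported in (55) for the standard linker, in angstroms.
DEFAULT_CUTOFF = 82.0


def c_terminal_distance(coord_a, coord_b):
    """Euclidean distance between two (x, y, z) coordinates, in angstroms."""
    return math.dist(coord_a, coord_b)


def within_reach(coord_a, coord_b, cutoff=DEFAULT_CUTOFF):
    """True if the two C-termini are closer than the cutoff distance."""
    return c_terminal_distance(coord_a, coord_b) < cutoff


# Hypothetical coordinates, used only to illustrate the calculation.
print(within_reach((0.0, 0.0, 0.0), (30.0, 40.0, 0.0)))
```

A longer linker would correspond to a larger cutoff in this picture, although the relation between linker length and effective reach is not assumed to be linear here.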

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
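
As a small illustration of the linker design, the following Python sketch builds the peptide sequence of each linker from the repeated Gly-Gly-Gly-Gly-Ser motif. The function name is hypothetical, and only the peptide sequence is modelled; the synonymous codons used in the plasmids are not represented.

```python
# One repeat of the linker motif in one-letter amino acid code.
MOTIF = "GGGGS"


def linker_sequence(repeats):
    """Peptide sequence of a linker made of `repeats` copies of the motif."""
    if repeats < 1:
        raise ValueError("a linker needs at least one repeat")
    return MOTIF * repeats


# 2xL, 3xL and 4xL linkers and their lengths in amino acids.
for n in (2, 3, 4):
    seq = linker_sequence(n)
    print(f"{n}xL: {seq} ({len(seq)} aa)")
```

The 2xL linker is thus ten amino acids long, and each additional repeat adds five residues.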

In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. The 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused to the DHFR fragments (PCR primer sequences are given in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusion for all strains.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains with proteins fused to the 2xL and 4xL. These proteins were selected because their abundance could easily be assessed using antibodies raised against them. 20 OD600 units of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads (425-600 µm; Sigma; 0.1 g) were added and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C, membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or mouse anti-actin (as a loading control; 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10000), IRDye® 680RD donkey anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we performed a review of the literature and considered the consensus protein complexes published in (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9; 7 tested), proteasome (n=47; 44 tested) and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 of the 95 proteins initially selected are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
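
The block design can be sketched in a few lines of Python: the four linker combinations tested for a given protein pair are the Cartesian product of the two linker sizes carried by each partner. The function name is hypothetical, and the sketch only enumerates the combinations; it does not reproduce the physical positions on the array.

```python
from itertools import product


def linker_block(first_linkers=("2xL", "4xL"), second_linkers=("2xL", "4xL")):
    """All linker size combinations tested for one pair of proteins."""
    return [f"{a}-{b}" for a, b in product(first_linkers, second_linkers)]


print(linker_block())
```

With the default arguments this enumerates the same four combinations listed in the text, in the same order.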

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey plates were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30°C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we used standard DHFR PCA to confirm 25 PPIs for which the difference in signal was increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments is shown in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution was spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and applied a manual threshold across plates. We selected colonies for measurement with a circular selection, using particle detection with the built-in ImageJ64 function “Analyze Particles”. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, and used the particle closest to the center. We considered the particle to be a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
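
The particle filter can be illustrated with a short Python sketch. It assumes that the area (in pixels) and perimeter of each particle have already been measured, uses the circularity definition 4π × area / perimeter² (the definition used by ImageJ), and treats the two criteria as independent exclusion rules, which is one possible reading of the text. The function names are hypothetical and this is not the custom ImageJ script used in the study.

```python
import math


def circularity(area, perimeter):
    """Circularity as 4*pi*area / perimeter**2 (1.0 for a perfect circle)."""
    return 4.0 * math.pi * area / perimeter ** 2


def keep_particle(area, perimeter, min_area=20, min_circularity=0.5):
    """Keep a particle only if it is large enough and round enough."""
    return area >= min_area and circularity(area, perimeter) >= min_circularity


# A disc of radius 10 pixels passes; a thin 1 x 100 pixel streak does not.
disc = (math.pi * 10 ** 2, 2 * math.pi * 10)
streak = (100.0, 202.0)
print(keep_particle(*disc), keep_particle(*streak))
```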

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
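
The transformation described above amounts to log2(x + 1), which maps a null intensity to 0 instead of an undefined value. A minimal Python sketch, with a hypothetical function name and made-up intensity values:

```python
import math


def log2_plus_one(intensities):
    """Apply log2(x + 1) to each colony intensity value."""
    return [math.log2(value + 1) for value in intensities]


# Made-up intensities, including a null value.
print(log2_plus_one([0, 1, 3, 1023]))
```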

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected. These Zs were used to compare the same interaction across linker size combinations. We considered a change significant when Zs differed by more than 2.
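
The scoring step can be sketched in Python as follows. The sketch assumes that the background mean (b) and standard deviation (sdb) have already been estimated, for example from the mixture model; it does not fit the mixture itself. Function names and the numerical values in the example are hypothetical.

```python
def z_score(interaction_score, background_mean, background_sd):
    """Zs = (Is - b) / sdb."""
    return (interaction_score - background_mean) / background_sd


def is_detected(zs, threshold=2.5):
    """An interaction is called detected when Zs exceeds the threshold."""
    return zs > threshold


def significant_change(zs_longer, zs_reference, min_delta=2.0):
    """Zs values for the same pair differ by more than min_delta."""
    return abs(zs_longer - zs_reference) > min_delta


# Made-up values: Is = 10 with a background of mean 4 and sd 2.
zs = z_score(10.0, 4.0, 2.0)
print(zs, is_detected(zs))
```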

For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values lower than Q1 − 3(Q3 − Q1) or higher than Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. Of the 236 protein pairs presenting detected interactions with at least one linker combination, some were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) gave inconsistent results for all tested interactions, leaving a total of 228 (197 unique) pairs of interacting proteins.
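
The outlier rule can be sketched in Python with the standard library. The quartile estimation method is not specified in the text, so the "inclusive" method used below is an assumption, as are the function name and the example values.

```python
import statistics


def remove_extreme_outliers(values, k=3.0):
    """Drop values below Q1 - k*(Q3 - Q1) or above Q3 + k*(Q3 - Q1)."""
    # Quartile method is an assumption; other methods give slightly different cut points.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]


# Made-up replicate values with one extreme colony.
print(remove_extreme_outliers([10, 11, 12, 13, 14, 100]))
```

Using k = 3 rather than the more common 1.5 removes only extreme values and keeps moderately deviating replicates.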

At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that all combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL) could be tested for a specific interaction within a block of four adjacent strains, the Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented a significant change for at least one longer linker combination (difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
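
The classification rules above can be sketched as follows. The text does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment here is an assumption, and the function names are hypothetical. The paired t-test itself is left out: the sketch starts from raw p-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(detected_ref, detected_longer, differences, fdr_p_values,
                  alpha=0.05, min_change=1.0):
    """Classify one protein pair.

    differences and fdr_p_values hold one value per longer linker
    combination (2xL-4xL, 4xL-2xL, 4xL-4xL), relative to 2xL-2xL.
    Simplification: pairs detected with 2xL-2xL are not re-checked for
    detection with the longer linkers.
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    for diff, p in zip(differences, fdr_p_values):
        if abs(diff) > min_change and p < alpha:
            return "quantitative change"
    return "unchanged"


# Hypothetical raw p-values for three comparisons of one pair.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
print(classify_pair(True, True, [1.5, 0.2, 0.3], adjusted))
```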

Analysis of protein distances within complexes

Yeast protein sequences of RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in the PyMOL software. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were taken from structure 5CZ4. Fig S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the two nodes. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered as surface residues (see Fig S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
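
The graph construction and shortest-path step can be illustrated with a small standard-library sketch. The analysis itself relied on NetworkX; the heap-based Dijkstra below is a stand-in so that the example has no dependencies, and the node names and coordinates are hypothetical.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes separated by at most `cutoff` (in Å).

    coords maps a node id to its (x, y, z) position; each edge weight is
    the Euclidean distance between the two nodes it joins.
    """
    graph = {node: {} for node in coords}
    nodes = list(coords)
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            d = math.dist(coords[u], coords[v])
            if d <= cutoff:
                graph[u][v] = d
                graph[v][u] = d
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, weight in graph[node].items():
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf


# Hypothetical surface Cα positions (Å); A and C are too far apart for a
# direct edge, so the path has to go through B along the surface.
toy_coords = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (20, 0, 0), "E": (100, 0, 0)}
toy_graph = build_graph(toy_coords, cutoff=15.0)
print(shortest_path_length(toy_graph, "A", "C"))
```

With NetworkX, the same quantity would typically be obtained by adding the edges with a `weight` attribute and calling `networkx.dijkstra_path_length`. Restricting nodes to surface residues is what makes the path follow the outside of the complex rather than cutting through it.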

All PPI data related to the global PCA and intra-complex experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI) after filtration.
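
The distinction between protein pairs and unique pairs is simply a matter of ignoring the bait/prey orientation. A minimal sketch, with a hypothetical helper name and a hypothetical list of detected pairs:

```python
def unique_pairs(directed_pairs):
    """Collapse reciprocal (bait, prey) pairs into orientation-free pairs."""
    return {tuple(sorted(pair)) for pair in directed_pairs}


# Hypothetical detected (bait, prey) pairs; the first two are reciprocal.
detected = [("Rpb2", "Rpb3"), ("Rpb3", "Rpb2"), ("Rpb5", "Rpo26")]
print(len(detected), len(unique_pairs(detected)))  # prints "3 2"
```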

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e., X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the fact that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B). The PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from the destabilization of the protein (Fig 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe a or lobe b (Fig 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, the PCA signal obtained with longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.
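
The proportion of detected PPIs over tested PPIs for each combination of subcomplexes can be tabulated as sketched here. The function name and the detection outcomes are hypothetical; only the subcomplex membership of the example proteins reflects the proteasome organization.

```python
from collections import defaultdict


def detection_fraction(tested, subcomplex_of):
    """Fraction of detected PPIs per unordered pair of subcomplexes.

    tested: iterable of (protein_a, protein_b, detected) tuples.
    subcomplex_of: dict mapping a protein name to its subcomplex label.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [detected, tested]
    for a, b, detected in tested:
        key = tuple(sorted((subcomplex_of[a], subcomplex_of[b])))
        counts[key][1] += 1
        if detected:
            counts[key][0] += 1
    return {key: hit / total for key, (hit, total) in counts.items()}


subcomplex_of = {"Rpn5": "lid", "Rpn6": "lid", "Rpt1": "base",
                 "Scl1": "core", "Pup1": "core"}
# Hypothetical outcomes for one linker combination.
tested = [("Rpn5", "Rpn6", True), ("Rpn5", "Rpt1", False),
          ("Rpn6", "Rpt1", True), ("Scl1", "Pup1", True)]
fractions = detection_fraction(tested, subcomplex_of)
print(fractions)
```

Running the same tabulation once per linker combination gives the values to compare across linker lengths.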

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other combinations (Fig 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases the signal or leads to the detection of new interactions (Fig 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.
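
The distance versus z-score relationship is summarized with a Spearman correlation, which is the Pearson correlation of the ranks. The dependency-free sketch below uses hypothetical distances and z-scores; it is not the software used for the reported values.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1  # average rank of the tied block
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical C-terminus distances (Å) and z-scores for five pairs.
distances = [10, 20, 30, 40, 50]
z_scores = [9.0, 7.5, 8.0, 3.0, 1.0]
print(round(spearman(distances, z_scores), 3))  # prints -0.9
```

A negative coefficient means that z-scores tend to decrease as the distance between C-termini increases, which is the trend reported for the proteasome.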

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering each pair of reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs over the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between the z-scores of all detected PPIs in the proteasome and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini: interactions that are not affected by longer linkers, those that increase in signal and those that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request

Table S1B PCA data for global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request

Table S1C PCA data for intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request

Table S1D PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request


Table S2A Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request


Table S2B. Identity between each RNApol structure and the experimental sequences

Reference       Yeast protein  Complex     Identity (%)
4C2M chain 1    Rpc10          RNApol I    100
4C2M chain 2    Rpa34          RNApol I    92.4
4C2M chain 3    Rpa49          RNApol I    94.4
4C2M chain 4    Rpa43          RNApol I    100
4C2M chain 5    Rpa190         RNApol I    89.7
4C2M chain 6    Rpc40          RNApol I    100
4C2M chain 7    Rpa135         RNApol I    97.2
4C2M chain 8    Rpb5           RNApol I    100
4C2M chain 9    Rpa14          RNApol I    59.6
4C2M chain 10   Rpa43          RNApol I    81.4
4C2M chain 11   Rpo26          RNApol I    100
4C2M chain 12   Rpa12          RNApol I    100
4C2M chain 13   Rpb8           RNApol I    88.2
4C2M chain 14   Rpc19          RNApol I    100
4C2M chain 15   Rpb10          RNApol I    100
4C2M chain 16   Rpa49          RNApol I    100
4C2M chain 17   Rpc10          RNApol I    100
4C2M chain 18   Rpa43          RNApol I    100
4C2M chain 19   Rpa34          RNApol I    92.4
4C2M chain 20   Rpa135         RNApol I    96.2
4C2M chain 21   Rpa190         RNApol I    88.5
4C2M chain 22   Rpa14          RNApol I    55.1
4C2M chain 23   Rpc40          RNApol I    100
4C2M chain 24   Rpo26          RNApol I    100
4C2M chain 25   Rpb5           RNApol I    100
4C2M chain 26   Rpb8           RNApol I    88.2
4C2M chain 27   Rpa43          RNApol I    80.2
4C2M chain 28   Rpb10          RNApol I    100
4C2M chain 29   Rpa12          RNApol I    96
4C2M chain 30   Rpc19          RNApol I    100
4C3I chain A    Rpa190         RNApol I    89.2
4C3I chain C    Rpc40          RNApol I    99.3
4C3I chain B    Rpa135         RNApol I    98.2
4C3I chain E    Rpb5           RNApol I    100
4C3I chain D    Rpa14          RNApol I    55.1
4C3I chain G    Rpa43          RNApol I    78.3
4C3I chain F    Rpo26          RNApol I    100
4C3I chain I    Rpa12          RNApol I    100
4C3I chain H    Rpb8           RNApol I    84.7
4C3I chain K    Rpc19          RNApol I    100
4C3I chain J    Rpb10          RNApol I    100
4C3I chain M    Rpa49          RNApol I    97.2
4C3I chain L    Rpc10          RNApol I    100
4C3I chain N    Rpa34          RNApol I    88
4V1N chain A    Rpo21          RNApol II   97.9
4V1N chain C    Rpb3           RNApol II   100
4V1N chain B    Rpb2           RNApol II   93.6
4V1N chain E    Rpb5           RNApol II   100
4V1N chain D    Rpb4           RNApol II   80.8
4V1N chain G    Rpb7           RNApol II   100
4V1N chain F    Rpo26          RNApol II   100
4V1N chain I    Rpb9           RNApol II   100
4V1N chain H    Rpb8           RNApol II   91
4V1N chain K    Rpb11          RNApol II   100
4V1N chain J    Rpb10          RNApol II   100
4V1N chain L    Rpc10          RNApol II   100
4V1N chain R    Tfg2           RNApol II   60.3
5FJA chain A    Rpo31          RNApol III  96.2
5FJA chain C    Rpc40          RNApol III  100
5FJA chain B    Ret1           RNApol III  100
5FJA chain E    Rpb5           RNApol III  100
5FJA chain D    Rpc17          RNApol III  73.9
5FJA chain G    Rpc25          RNApol III  85.8
5FJA chain F    Rpo26          RNApol III  100
5FJA chain I    Rpc11          RNApol III  82.7
5FJA chain H    Rpb8           RNApol III  94.5
5FJA chain K    Rpc19          RNApol III  100
5FJA chain J    Rpb10          RNApol III  100
5FJA chain M    Rpc37          RNApol III  84.9
5FJA chain L    Rpc10          RNApol III  100
5FJA chain O    Rpc82          RNApol III  84.3
5FJA chain N    Rpc53          RNApol III  73.8
5FJA chain Q    Rpc31          RNApol III  100
5FJA chain P    Rpc34          RNApol III  57.2


Table S2C. Identity between the proteasome structures and the experimental sequences

Reference                        Yeast protein  Complex     Identity (%)
5CZ4-centered chain A            Pre8           Proteasome  100
5CZ4-centered chain AA           Pre4           Proteasome  100
5CZ4-centered chain B            Pre9           Proteasome  100
5CZ4-centered chain BA           Pre3           Proteasome  100
5CZ4-centered chain C            Pre6           Proteasome  100
5CZ4-centered chain D            Pup2           Proteasome  97.1
5CZ4-centered chain E            Pre5           Proteasome  100
5CZ4-centered chain F            Pre10          Proteasome  100
5CZ4-centered chain G            Scl1           Proteasome  100
5CZ4-centered chain H            Pup1           Proteasome  100
5CZ4-centered chain I            Pup3           Proteasome  100
5CZ4-centered chain J            Pre1           Proteasome  100
5CZ4-centered chain K            Pre2           Proteasome  100
5CZ4-centered chain L            Pre7           Proteasome  100
5CZ4-centered chain M            Pre4           Proteasome  100
5CZ4-centered chain N            Pre3           Proteasome  100
5CZ4-centered chain O            Pre8           Proteasome  100
5CZ4-centered chain P            Pre9           Proteasome  100
5CZ4-centered chain Q            Pre6           Proteasome  100
5CZ4-centered chain R            Pup2           Proteasome  97.1
5CZ4-centered chain S            Pre5           Proteasome  100
5CZ4-centered chain T            Pre10          Proteasome  100
5CZ4-centered chain U            Scl1           Proteasome  100
5CZ4-centered chain V            Pup1           Proteasome  100
5CZ4-centered chain W            Pup3           Proteasome  100
5CZ4-centered chain X            Pre1           Proteasome  100
5CZ4-centered chain Y            Pre2           Proteasome  100
5CZ4-centered chain Z            Pre7           Proteasome  100
5A5B-centered chain A            Pre3           Proteasome  100
5A5B-centered chain AA           Rpn7           Proteasome  100
5A5B-centered chain B            Pup1           Proteasome  100
5A5B-centered chain BA           Rpn3           Proteasome  100
5A5B-centered chain C            Pup3           Proteasome  100
5A5B-centered chain CA           Rpn12          Proteasome  100
5A5B-centered chain D            Pre1           Proteasome  100
5A5B-centered chain DA           Rpn8           Proteasome  82.9
5A5B-centered chain E            Pre2           Proteasome  99.5
5A5B-centered chain EA           Rpn11          Proteasome  89.5
5A5B-centered chain F            Pre7           Proteasome  100
5A5B-centered chain FA           Rpn10          Proteasome  100
5A5B-centered chain G            Pre4           Proteasome  100
5A5B-centered chain GA           Rpn13          Proteasome  100
5A5B-centered chain HA           Sem1           Proteasome  100
5A5B-centered chain IA           Rpn1           Proteasome  85.9
5A5B-centered chain J            Scl1           Proteasome  100
5A5B-centered chain K            Pre8           Proteasome  100
5A5B-centered chain L            Pre9           Proteasome  100
5A5B-centered chain M            Pre6           Proteasome  100
5A5B-centered chain N            Pup2           Proteasome  100
5A5B-centered chain O            Pre5           Proteasome  100
5A5B-centered chain P            Pre10          Proteasome  100
5A5B-centered chain Q            Rpt1           Proteasome  88
5A5B-centered chain R            Rpt2           Proteasome  100
5A5B-centered chain S            Rpt6           Proteasome  100
5A5B-centered chain T            Rpt3           Proteasome  100
5A5B-centered chain U            Rpt4           Proteasome  100
5A5B-centered chain V            Rpt5           Proteasome  93.1
5A5B-centered chain W            Rpn2           Proteasome  90.9
5A5B-centered chain X            Rpn9           Proteasome  100
5A5B-centered chain Y            Rpn5           Proteasome  100
5A5B-centered chain Z            Rpn6           Proteasome  100
Constructed proteasome chain 1   Pup1           Proteasome  100
Constructed proteasome chain 10  Pre8           Proteasome  100
Constructed proteasome chain 11  Pre9           Proteasome  100
Constructed proteasome chain 12  Pre6           Proteasome  100
Constructed proteasome chain 13  Pup2           Proteasome  100
Constructed proteasome chain 14  Pre5           Proteasome  100
Constructed proteasome chain 15  Pre10          Proteasome  100
Constructed proteasome chain 16  Rpt1           Proteasome  88
Constructed proteasome chain 17  Rpt2           Proteasome  100
Constructed proteasome chain 18  Rpt6           Proteasome  100
Constructed proteasome chain 19  Rpt3           Proteasome  100
Constructed proteasome chain 2   Pup3           Proteasome  100
Constructed proteasome chain 20  Rpt4           Proteasome  100
Constructed proteasome chain 21  Rpt5           Proteasome  93.1
Constructed proteasome chain 22  Rpn2           Proteasome  90.9
Constructed proteasome chain 23  Rpn9           Proteasome  100
Constructed proteasome chain 24  Rpn5           Proteasome  100
Constructed proteasome chain 25  Rpn6           Proteasome  100
Constructed proteasome chain 26  Rpn7           Proteasome  100
Constructed proteasome chain 27  Rpn3           Proteasome  100
Constructed proteasome chain 28  Rpn12          Proteasome  100
Constructed proteasome chain 29  Rpn8           Proteasome  82.9
Constructed proteasome chain 3   Pre1           Proteasome  100
Constructed proteasome chain 30  Rpn11          Proteasome  89.5
Constructed proteasome chain 31  Rpn10          Proteasome  100
Constructed proteasome chain 32  Rpn13          Proteasome  100
Constructed proteasome chain 33  Sem1           Proteasome  100
Constructed proteasome chain 34  Rpn1           Proteasome  85.9
Constructed proteasome chain 35  Pup1           Proteasome  100
Constructed proteasome chain 36  Pup3           Proteasome  100
Constructed proteasome chain 37  Pre1           Proteasome  100
Constructed proteasome chain 38  Pre2           Proteasome  100
Constructed proteasome chain 39  Pre7           Proteasome  100
Constructed proteasome chain 4   Pre2           Proteasome  100
Constructed proteasome chain 40  Pre4           Proteasome  100
Constructed proteasome chain 41  Pre3           Proteasome  100
Constructed proteasome chain 42  Pre4           Proteasome  100
Constructed proteasome chain 45  Scl1           Proteasome  100
Constructed proteasome chain 46  Pre8           Proteasome  100
Constructed proteasome chain 47  Pre9           Proteasome  100
Constructed proteasome chain 48  Pre6           Proteasome  100
Constructed proteasome chain 49  Pup2           Proteasome  100
Constructed proteasome chain 5   Pre7           Proteasome  100
Constructed proteasome chain 50  Pre5           Proteasome  100
Constructed proteasome chain 51  Pre10          Proteasome  100
Constructed proteasome chain 52  Rpt1           Proteasome  88
Constructed proteasome chain 53  Rpt2           Proteasome  100
Constructed proteasome chain 54  Rpt6           Proteasome  100
Constructed proteasome chain 55  Rpt3           Proteasome  100
Constructed proteasome chain 56  Rpt4           Proteasome  100
Constructed proteasome chain 57  Rpt5           Proteasome  93.1
Constructed proteasome chain 58  Rpn2           Proteasome  90.9
Constructed proteasome chain 59  Rpn9           Proteasome  100
Constructed proteasome chain 6   Pre3           Proteasome  100
Constructed proteasome chain 60  Rpn5           Proteasome  100
Constructed proteasome chain 61  Rpn6           Proteasome  100
Constructed proteasome chain 62  Rpn7           Proteasome  100
Constructed proteasome chain 63  Rpn3           Proteasome  100
Constructed proteasome chain 64  Rpn12          Proteasome  100
Constructed proteasome chain 65  Rpn8           Proteasome  82.9
Constructed proteasome chain 66  Rpn11          Proteasome  89.5
Constructed proteasome chain 67  Rpn10          Proteasome  100
Constructed proteasome chain 68  Rpn13          Proteasome  100
Constructed proteasome chain 69  Sem1           Proteasome  100
Constructed proteasome chain 70  Rpn1           Proteasome  85.9
Constructed proteasome chain 9   Scl1           Proteasome  100


Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Columns: yeast protein, complex, reference structure, number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right owing to the ability of the 4xL linker to detect interactions between proteins farther apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results of the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
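Panel (B) describes positive PPIs as those whose colony size gives a z-score above a threshold. The following is only a minimal sketch of that kind of call, assuming the z-score is computed against the mean and standard deviation of a background distribution of colony sizes; the caption does not state how the scores are computed, so this is an assumption. The function name, the pair labels and all numbers are hypothetical, and the cutoff is passed in as a parameter rather than fixed.

```python
from statistics import mean, stdev


def call_positive_ppis(colony_sizes, background_sizes, z_threshold):
    """Return {pair: z_score} for pairs whose z-score exceeds z_threshold.

    colony_sizes: dict mapping a (bait, prey) pair to its colony size.
    background_sizes: colony sizes assumed to represent non-interacting pairs.
    """
    mu_b = mean(background_sizes)    # estimated background mean
    sd_b = stdev(background_sizes)   # sample standard deviation of the background
    z_scores = {pair: (size - mu_b) / sd_b for pair, size in colony_sizes.items()}
    return {pair: z for pair, z in z_scores.items() if z > z_threshold}


# Hypothetical, made-up numbers used only to exercise the function.
background = [10.0, 12.0, 11.0, 9.0, 13.0]
sizes = {("bait1", "prey1"): 20.0, ("bait1", "prey2"): 11.5}
positives = call_positive_ppis(sizes, background, z_threshold=2.5)
print(sorted(positives))
```

Keeping the background sample separate from the tested pairs makes the assumption explicit: the call is only as good as the choice of pairs treated as non-interacting.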


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres (dots): surface. Green spheres: surface residues on the proteasome.
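Panels (C) and (D) refer to a distance-weighted shortest path between two C-termini that runs over surface residues rather than straight through the complex. The sketch below illustrates the general idea with Dijkstra's algorithm on a graph whose nodes are residues and whose edges are weighted by Euclidean distance. The neighbor cutoff `max_edge`, the function name and the toy coordinates are assumptions for illustration; the caption does not specify how the graph was actually built.

```python
import heapq
import math


def surface_path_length(coords, start, end, max_edge):
    """Length of the shortest path from start to end over the given residues.

    coords: dict mapping a residue label to its (x, y, z) position in angstroms.
    Two residues are connected when they are at most max_edge apart, and the
    edge weight is their Euclidean distance. Returns math.inf if unreachable.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == end:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry, a shorter route was already found
        for other, xyz in coords.items():
            if other == node:
                continue
            step = math.dist(coords[node], xyz)
            if step <= max_edge and dist + step < best.get(other, math.inf):
                best[other] = dist + step
                heapq.heappush(queue, (dist + step, other))
    return math.inf


# Hypothetical toy coordinates: two C-termini joined through two surface residues.
toy = {
    "cterm_a": (0.0, 0.0, 0.0),
    "surf_1": (3.0, 0.0, 0.0),
    "surf_2": (3.0, 4.0, 0.0),
    "cterm_b": (6.0, 4.0, 0.0),
}
print(surface_path_length(toy, "cterm_a", "cterm_b", max_edge=4.5))
```

In this toy case the straight-line distance between the two termini is shorter than the path over the surface, which is the reason a path-based distance can differ from a simple Euclidean one.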


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term "hybrid method" refers to a method that detects associations between proteins that are close to each other in space without necessarily interacting physically. Such a method would make it possible to dissect the architecture of protein complexes in greater depth. Concretely, the approach consisted of modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing linker length changed the set of interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thus improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, more than 30% of all the unique interactions detected were new. One could therefore expect to obtain nearly as many new interactions if this variation of PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases. The rare cases in which the signal decreased as linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when central proteins are involved. Finally, the associations detected reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to DHFR PCA represents a notable advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths; this would provide a validation and a point of comparison, and would reveal problems that may have occurred during construction of the fusion proteins. For example, it becomes easier to spot a faulty recombination event or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA provides an additional hybrid method that remains relatively simple. It does not require any particular infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close to each other, which reduces false positives. DHFR PCA can be performed on either solid or liquid medium, so it is easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would give several resolutions of the PPI network. The first step would be to determine the maximal length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also need to be determined, so as to maximize the new information gained while taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.
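To give a feel for how linker length translates into spatial reach, the sketch below computes a rough upper bound on the span of a linker made of repeats of the Gly-Gly-Gly-Gly-Ser motif. The value of 3.8 Å per residue is an assumption (roughly the spacing between consecutive alpha carbons in a fully extended chain), the function names are hypothetical, and the estimate ignores the size of the reporter fragments and the fact that flexible linkers are rarely fully extended. It is a back-of-the-envelope bound, not a measured detection distance.

```python
RESIDUES_PER_REPEAT = 5      # one Gly-Gly-Gly-Gly-Ser motif
ANGSTROM_PER_RESIDUE = 3.8   # assumed upper bound for a fully extended chain


def max_linker_length(repeats):
    """Upper bound, in angstroms, on the span of a linker with this many repeats."""
    return repeats * RESIDUES_PER_REPEAT * ANGSTROM_PER_RESIDUE


def max_reach(repeats_a, repeats_b):
    """Upper bound on the distance bridged by the two linkers of a fragment pair."""
    return max_linker_length(repeats_a) + max_linker_length(repeats_b)


# Linker combinations named after the number of motif repeats on each fragment.
for a, b in [(2, 2), (2, 4), (4, 4)]:
    print(f"{a}xL-{b}xL: at most about {max_reach(a, b):.0f} angstroms")
```

Under these assumptions each added repeat extends the bound by the same amount, which is one simple way to reason about the increment between linker lengths discussed above.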

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not perfectly known but which are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they have been linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


Abstract

Protein-protein interactions (PPIs) are central to all cellular processes in all organisms. Methods to study PPIs fall into two categories: they either identify the proteins that make up protein complexes or determine the relationships between proteins. Only a few hybrid methods provide both kinds of information, and these methods have many limitations. The goal of this project was to develop a new hybrid method by modifying the dihydrofolate reductase protein-fragment complementation assay (DHFR PCA) in the yeast Saccharomyces cerevisiae. DHFR PCA is based on the association of two complementary reporter fragments in the presence of an interaction. Each fragment is fused to a protein through a peptide linker. Linker length limits the maximal distance at which an interaction between two proteins can be detected. Our hypothesis was that increasing linker length would allow the detection of more distant interactions. We first verified whether increasing linker length modified our capacity to detect interactions without losing specificity. New interactions were detected within and between complexes. We then validated our capacity to better dissect the architecture of protein complexes by studying five protein complexes with different combinations of linker lengths. Finally, we confirmed that the method allowed the detection of interactions between proteins farther apart in space by comparing our results with distances calculated from available proteasome structures. This variation of DHFR PCA makes it possible to modulate the resolution of PPI studies and thus to better define the architecture of protein complexes.


Table of contents

Résumé (French abstract)
Abstract
Table of contents
List of tables
List of figures
List of abbreviations
Acknowledgements
Foreword
General introduction
1.1 The fundamental importance of protein-protein interactions
1.2 Practical applications of the study of protein-protein interactions
1.3 Categories of methods for studying protein-protein interactions
1.3.1 Methods that identify the members of a protein complex: purification of protein complexes followed by mass spectrometry
1.3.2 Methods that determine the protein interaction network
1.4 Current challenge in the study of protein-protein interactions
1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions
1.6 Research objectives
Measuring proximate protein association in living cells using Protein-fragment complementation assay (PCA)
Résumé (French abstract)
Abstract
Introduction
Material and Methods
Yeast
Bacteria
Plasmid construction
Strain construction
Estimation of protein abundance
Protein-fragment complementation assays
PCA images and statistical analyses
Analysis of protein distances within complexes
Results and discussion
Longer linkers increase signal-to-noise ratio in large-scale screens
PCA signal reflects the super-organization of protein complexes
Longer linkers allow detection of more distant proteins in complexes
Conclusion
Acknowledgements
General conclusion
Bibliography


List of tables

Table S1A. Description of the strains constructed and used for this study
Table S1B. PCA data for the global PCA experiment
Table S1C. PCA data for the intra-complex experiment
Table S1D. PCR primers used in this study
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2B. Identity between each RNApol structure and the experimental sequences
Table S2C. Identity between the proteasome structure and the experimental sequence
Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures


List of figures

Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale Protein-fragment complementation (PCA) screen and prove useful for inferring the super-organization of protein complexes
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes
Figure S1. Data related to the PCA experiments
Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins


List of abbreviations

%: Percentage
°C: Degree Celsius
Å: Ångström
DNA: Deoxyribonucleic acid
Amp: Ampicillin
mRNA: Messenger ribonucleic acid
BioID: Proximity-dependent biotinylation
ClonNAT: Nourseothricin
COG: Conserved oligomeric Golgi
DHFR: Dihydrofolate reductase
DMSO: Dimethyl sulfoxide
F[1,2]: Fragment 1,2 of DHFR
F[3]: Fragment 3 of DHFR
FDR: Corrected P value
FRET: Energy transfer between fluorescent molecules
g: Gram
Gly or G: Glycine
h: Hour
HygB: Hygromycin B
Is: Interaction score
L: Liter
Log: Logarithm
M: Molar
Min: Minute
mL: Milliliter
mM: Millimolar
MS: Mass spectrometry
MS/MS: Tandem mass spectrometry
MTX: Methotrexate
MYTH: Membrane yeast two-hybrid
NaCl: Sodium chloride
NMR: Nuclear magnetic resonance
OD: Optical density
PBS: Phosphate-buffered saline
PCA: Protein-fragment complementation assay
PCR: Polymerase chain reaction
PKA: Protein kinase A
PPI: Protein-protein interaction
Q1: Quartile 1
Q3: Quartile 3
r: Correlation coefficient
RNApol: RNA polymerase
Sdb: Standard deviation
Ser or S: Serine
SDS: Sodium dodecyl sulfate
SDS-PAGE: Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test: Student's t-test
YPD: Yeast extract peptone dextrose
Y2H: Yeast two-hybrid
Zs: Z-score
µb: Estimated mean
µg: Microgram
µL: Microliter
µM: Micromolar
2YT: 2x yeast extract tryptone
2xL: Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL: Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL: Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif

Acknowledgments

Completing this project required the help of several people, whom I sincerely wish to thank. First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey, Christian encouraged me to give the best of myself, both scientifically and as a member of the group. He not only gave me the material means to do so, but also showed me that I had the ability to do it. Christian is a very present supervisor who makes himself available to his students. He offered me opportunities and supported me in each of them.

I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr. Nicolas Bisson, for their advice and the time they devoted to this project.

I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the two research professionals of the laboratory. Their great expertise and their passion for science are a pillar of this team. Without their valuable advice, their dedication and their availability, carrying out this project would have been particularly arduous. I also wish to thank my collaborators, Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is all the better. Special thanks to Xavier for his help, his availability and the stimulating discussions.

I believe it is important to thank all the members of the Landry laboratory. Graduate studies require spending a great deal of time in the laboratory, which becomes like a second home. Hence the importance of sharing laughs and building a rapport with its members. I would like to thank all of them for the chatting and the laughter at the famous "tea breaks", the lively discussions and, of course, the support both in the laboratory and morally. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. A very special thank you to Guillaume and Hélène, who were particularly good at putting a smile on my face or supporting and advising me in times of difficulty.

It is also important to thank my parents, as well as my whole family and my friends. My parents always encouraged me to fulfill myself and to love my work. They not only provided me with an ideal setting in which to reach my goals throughout my studies, but also offered me their moral support and instilled in me the importance of always doing one's best. The values they passed on to me gave me a strong sense of responsibility, honesty and commitment. Thanks to my family and my friends, I was able to unwind, simply have fun and pour my heart out from time to time. They were a moral support.

Finally, I wish to thank, from the bottom of my heart, my partner, Marc Bélanger. Marc is an incredibly generous person: generous with his time, his attention, his knowledge and his passions. He was an invaluable support throughout this journey, at every moment. His encouragement, his shoulder, his tissues and his understanding soothed my fears and my sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has brought me personally, humanly and professionally. Marc has made me a better person, and I will always be grateful to him. Thank you, my love, thank you for everything.

Foreword

This thesis comprises a single chapter written in the form of a scientific article that will be submitted for publication. The article presents an adaptation of the PCA method that makes it possible to detect associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to the planning of the experiments with Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research professionals). Several people, myself included, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were carried out jointly by Isabelle Gagnon-Arsenault, Christian Landry and myself.

During this project, I also contributed to the writing of a literature review published in Briefings in Functional Genomics in March 2016 under the title "Multi-scale perturbations of protein interactomes reveals their mechanisms of regulation, robustness and insights into genotype-phenotype maps". Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry. That article is not presented in this thesis.

General introduction

1.1 The fundamental nature of protein-protein interactions

Proteins, through their great diversity of roles, are regarded as the machinery of life. Their temporary or permanent associations are at the heart of signaling and regulatory pathways as well as protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since they take part in all cellular processes and in the maintenance of cellular functions.

Interactions that form transiently are often found in signaling and regulatory processes. They require excellent spatiotemporal coordination, which explains why poor coordination can lead to diseases such as cancer (1). One example of a transient association is that of the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, aging and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).

In addition to transient interactions, the cell is home to stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs within a complex are stable, the complex itself may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins allows additional biological activities to emerge that would be impossible for the individual proteins. An example that illustrates this concept very well is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).

The two preceding examples clearly demonstrate the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning and cellular viability of a given organism. The PPI network has been mapped on a large scale for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses (27-29). These maps are a static picture of the network and do not fully take into account the cell's ability to adapt to different conditions (e.g., environment, cell cycle). To overcome this limitation, additional maps were then produced that consider the dynamics of interaction networks, namely by perturbing cell growth conditions. Among other things, they provide information on the adaptation, or plasticity, of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.

1.2 Practical applications of the study of protein-protein interactions

The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore in yeast and trypanosomes (30). These two organisms, which diverged more than 1.5 billion years ago, show both similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the membrane of the cell nucleus and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mRNA export mechanisms in the two organisms (30). In addition, perturbing PPIs makes it possible to assess the robustness of a protein complex to mutations, that is, the ability of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analyzing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of certain proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).

In the medical field, the study of PPIs has been widely used to discover new drugs (32-34). In addition, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which may eventually make it possible to treat the infections caused by these parasites (35). PPIs also help in understanding the genetic basis of diseases, as shown by Sahni and colleagues. This team examined nearly 3,000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of the interaction networks, whether partial or complete, was responsible for the diseases under study. Moreover, different mutations in the same gene lead to different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, since each has advantages and limitations that allow it to target only a particular subset of the interaction network (37). Nonetheless, all of these methods can be divided into two main categories: methods that determine the composition of protein complexes and methods that determine the physical interactions between two proteins.

The first category includes methods that purify a protein complex, either by affinity chromatography or by separation chromatography, and then analyze it by mass spectrometry (MS). The second category brings together a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in this second category rely on a very similar principle: the reconstitution of a functional reporter that emits a signal when the two proteins physically interact. The second category also includes three hybrid methods: fluorescence resonance energy transfer (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, the term "hybrid method" refers to methods that detect associations between proteins that are close together in space without these associations necessarily being physical interactions. These methods therefore share characteristics of both categories. For the purposes of this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary because they define, on the one hand, the components of a protein complex and, on the other hand, the relationships among them.

1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purifying protein complexes and identifying their components by MS is an approach that aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract using an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the non-specific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins found in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first MS, a peptide is selected and fragmented, and a new spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this type of purification is that it can isolate all the protein complexes of an organism for study (43).
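As a rough illustration of the mass-to-charge separation described above, the following Python sketch computes the monoisotopic mass of a short peptide and its m/z for a given charge state. The example peptide (GGGGS), the function names and the deliberately small residue table are illustrative assumptions and are not part of the workflow used in this work; the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses in daltons (standard values; only a few residues listed).
RESIDUE_MASS = {"G": 57.02146, "A": 71.03711, "S": 87.03203}
WATER = 18.01056   # one water per linear peptide (free N- and C-termini)
PROTON = 1.00728   # mass added per positive charge


def peptide_mass(sequence: str) -> float:
    """Neutral monoisotopic mass of a linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mass_to_charge(sequence: str, charge: int) -> float:
    """m/z of the peptide carrying `charge` protons (positive-ion mode)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


if __name__ == "__main__":
    # Hypothetical peptide used only to show that higher charge lowers m/z.
    for z in (1, 2):
        print(f"GGGGS, z={z}: m/z = {mass_to_charge('GGGGS', z):.4f}")
```

The same peptide therefore appears at different positions in the spectrum depending on its charge state, which is one reason database matching works on peak patterns rather than on single values.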

1.3.2 Methods determining the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment complementation

Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments attached to the two proteins of interest through a linker. When the two proteins of interest physically interact, the two reporter fragments assemble, reconstituting a functional reporter that produces a detectable signal. In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that paved the way for several others. However, this technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variations of the method have nevertheless been developed to allow the study of membrane proteins (45-47); one of them is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, suffers from background noise and is not quantitative. It often requires overexpressing the proteins, which can generate false positives. It is therefore impossible to establish links between the abundance of a protein and the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as humans, to be studied in a simpler model (51).

In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest physically interact, the transcription factor attached to the reconstituted ubiquitin is released, activating transcription of a reporter gene. Split-ubiquitin-based methods have enabled major advances in the study of membrane proteins, insoluble proteins and proteins located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the inability to quantify results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of using a transcription factor as the reporter, it uses a protein that has been split into two fragments. The choice of reporter and of the cleavage site were key elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to interact spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved, in particular, in the synthesis of certain DNA bases (the purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative, in addition to keeping the native promoter of the proteins studied (48, 55, 56). Furthermore, the results obtained by PCA suggest that the cellular localization of the proteins is preserved: there is a gene ontology enrichment for many proteins known to share the same cellular localization (55). However, a change in localization cannot be ruled out, since the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. Adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. The composition or length of the linker may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are especially useful when a flexible linker is insufficient to separate the two elements properly or when it interferes with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although placed in the second category of methods, FRET, cross-linking followed by MS and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on the transfer of energy between two fluorescent proteins located close to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. When the two proteins are close to each other, excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein. The interaction is detected by microscopy or by cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and the partial overlap of the fluorescence of the two proteins can hamper interpretation of the results (60-63).
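The distance dependence that makes FRET a proximity reporter is usually described by the Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the distance at which half of the energy is transferred. The sketch below simply evaluates this relation. The R0 value of 50 Å is an assumed, illustrative order of magnitude for a fluorescent protein pair and does not come from this work; the function name is likewise hypothetical.

```python
def fret_efficiency(distance: float, forster_radius: float) -> float:
    """Förster transfer efficiency E = 1 / (1 + (r / R0)**6).

    Both lengths must be expressed in the same unit (for example angstroms).
    """
    if distance < 0 or forster_radius <= 0:
        raise ValueError("distance must be >= 0 and forster_radius must be > 0")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


if __name__ == "__main__":
    R0 = 50.0  # assumed Förster radius in angstroms (illustrative only)
    for r in (25.0, 50.0, 75.0, 100.0):
        print(f"r = {r:5.1f} A  ->  E = {fret_efficiency(r, R0):.3f}")
```

Because the efficiency falls with the sixth power of the distance, the signal drops sharply once the two fluorescent proteins are separated by much more than R0, which is why FRET reports on proximity rather than on complex membership.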

Cross-linking followed by MS is virtually identical to the purification and MS techniques, except that the proteins are linked to one another by covalent bonds before purification. These bonds withstand enzymatic digestion and thus provide structural information on how the proteins are associated within the protein complex. However, cross-linking complicates data analysis and can lead to an incorrect picture of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity that is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID detects weak and transient interactions as well as interactions between neighboring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can interfere with the activity of the protein of interest or with the formation of interactions. In addition, this method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly attractive because they give a more global view of the PPI network. They report on the proximity of proteins, giving access to a new scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, the existing techniques require specific infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would help better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. These new hybrid methods would compensate for the shortcomings of high-molecular-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to large numbers of protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially interesting parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and of the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good association of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small, uncharged amino acids, nonpolar and polar respectively. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also places a constraint on the ability to detect an interaction, as observed notably by the research team that developed large-scale PCA (55). While studying RNA polymerase (RNApol) II and several other protein complexes, the authors noticed that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. Moreover, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to obtain new structural information.

1.6 Research objectives

The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length of the DHFR PCA would make it possible to detect interactions between proteins that are increasingly distant in space, thereby modulating the molecular scale of resolution. This adaptation would then yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To reach this objective, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The last objective was to determine whether increasing linker length made it possible to detect associations between proteins that are farther apart in space. To do so, distances were calculated between the proteins contained in the proteasome structures and compared with the experimental results.

This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive in several respects, notably the availability of numerous powerful genetic tools, its rapid cell division and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system functions requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories, based on whether they infer co-complex membership or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods, because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was shown that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
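The "ruler" idea can be illustrated with a back-of-the-envelope calculation. The sketch below takes the 82 Å figure reported for two standard ten-residue linkers placed end to end and extrapolates it to longer linkers. It assumes that reach scales linearly with the number of residues, which is a simplification introduced here and not a claim from the source; flexible linkers rarely adopt a fully extended conformation. The function names are hypothetical.

```python
# Distance reported in the text for two standard (2 x GGGGS) linkers end to end.
REACH_2XL_PAIR_ANGSTROM = 82.0
RESIDUES_PER_REPEAT = 5  # one GGGGS motif


def per_residue_length():
    """Length per residue (in angstroms) implied by the 82 A figure (assumption)."""
    residues_in_two_standard_linkers = 2 * 2 * RESIDUES_PER_REPEAT
    return REACH_2XL_PAIR_ANGSTROM / residues_in_two_standard_linkers


def max_reach(bait_repeats, prey_repeats):
    """Assumed maximal C-terminus separation spanned by the two linkers.

    Linear extrapolation only: an upper bound under the stated assumption.
    """
    residues = (bait_repeats + prey_repeats) * RESIDUES_PER_REPEAT
    return residues * per_residue_length()


if __name__ == "__main__":
    for bait, prey in [(2, 2), (3, 2), (4, 2), (4, 4)]:
        print(f"{bait}xL-{prey}xL: {max_reach(bait, prey):.1f} A")
```

Under this assumption, each added GGGGS repeat extends the reach by the same fixed increment, which is what would make linker length usable as a coarse distance ruler.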

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.

To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified and confirmed by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (for resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains with proteins fused with the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies raised against them. 20 OD600 of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed, and the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins, part of seven complexes, fused to 2x/3x/4xL-DHFR F[1,2]. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them, based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.
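The size of this screen follows directly from the numbers given above. The short sketch below only restates that arithmetic; the variable and function names are hypothetical.

```python
# Design parameters taken from the text above.
N_BAIT_PROTEINS = 15
BAIT_LINKERS = ("2xL", "3xL", "4xL")
N_PREYS_SELECTED = 495   # same complexes or reported interactors
N_PREYS_RANDOM = 97      # cytoplasmic or nuclear controls
REPLICATES_PER_PAIR = 4


def screen_size():
    """Return (bait strains, prey strains, bait-prey pairs, colonies measured)."""
    n_baits = N_BAIT_PROTEINS * len(BAIT_LINKERS)
    n_preys = N_PREYS_SELECTED + N_PREYS_RANDOM
    n_pairs = n_baits * n_preys
    return n_baits, n_preys, n_pairs, n_pairs * REPLICATES_PER_PAIR


if __name__ == "__main__":
    print(screen_size())
```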

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published by (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 of the 95 proteins initially selected are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above.

Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
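The block layout can be sketched as the Cartesian product of the two linker sizes for bait and prey. This is a minimal illustration: the strain label format and the function name are hypothetical, and the example protein names are placeholders rather than a pair taken from the study.

```python
from itertools import product

LINKERS = ("2xL", "4xL")


def linker_block(bait, prey):
    """Return the four bait/prey strain pairs forming one block.

    Baits carry DHFR F[3] and preys carry DHFR F[1,2], as in the
    intra-complexes experiment. Labels are illustrative only.
    """
    return [
        (f"{bait}-{bait_linker}-F[3]", f"{prey}-{prey_linker}-F[1,2]")
        for bait_linker, prey_linker in product(LINKERS, repeat=2)
    ]


if __name__ == "__main__":
    for pair in linker_block("ProteinA", "ProteinB"):
        print(pair)
```

Keeping the four combinations adjacent on the array is what later allows the scores for the different linker combinations to be compared directly within a block.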

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB, with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the difference in signal was increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files, and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection, using particle detection with the built-in function "Analyze Particles" in ImageJ64. We excluded particles touching the edge of the selection and those with an area below 20 pixels and a circularity below 0.5, and retained the particle closest to the center. We considered the particle to be a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2 transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
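These two steps can be sketched as follows. This is a minimal illustration, assuming each colony is described by its intensity on the MTX plate and its size on the diploid selection plate; the function and constant names are hypothetical.

```python
import math

MIN_DIPLOID_SIZE = 16  # colonies smaller than this on the diploid plate are dropped


def transform_colonies(colonies):
    """Return log2(intensity + 1) for colonies that passed diploid selection.

    `colonies` is an iterable of (mtx_intensity, diploid_size) pairs.
    Adding 1 before the log keeps zero intensities defined.
    """
    return [
        math.log2(mtx_intensity + 1)
        for mtx_intensity, diploid_size in colonies
        if diploid_size >= MIN_DIPLOID_SIZE
    ]


if __name__ == "__main__":
    print(transform_colonies([(0, 20), (7, 16), (100, 10)]))
```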

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained, and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
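The scoring step can be sketched as below, assuming the background mean and standard deviation have already been estimated from the two-component mixture fit (the fit itself, done with mixtools in R, is not reproduced here). The sketch reads the detection threshold as 2.5 and the change threshold as 2; the function names are hypothetical.

```python
from statistics import median

Z_DETECTION_THRESHOLD = 2.5  # Zs above this counts as a detected interaction
Z_CHANGE_THRESHOLD = 2.0     # Zs difference above this counts as a change


def interaction_score(replicate_sizes):
    """Interaction score (Is): median of the replicate colony sizes."""
    return median(replicate_sizes)


def z_score(score, background_mean, background_sd):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background component."""
    return (score - background_mean) / background_sd


def is_detected(zs):
    return zs > Z_DETECTION_THRESHOLD


def is_changed(zs_reference, zs_long_linker):
    return abs(zs_long_linker - zs_reference) > Z_CHANGE_THRESHOLD


if __name__ == "__main__":
    # Made-up replicate values and background parameters, for illustration only.
    score = interaction_score([10.0, 12.0, 11.0, 13.0])
    zs = z_score(score, background_mean=7.0, background_sd=1.5)
    print(score, zs, is_detected(zs))
```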

For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained, and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
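The extreme-outlier rule can be sketched with the standard library. Quartile definitions differ between tools, and the source does not say which one was used; this sketch relies on the default method of `statistics.quantiles`, which is an assumption. The function name is hypothetical.

```python
from statistics import quantiles

FENCE_MULTIPLIER = 3  # "extreme" outliers: 3 x IQR beyond the quartiles


def drop_extreme_outliers(values):
    """Keep values within [Q1 - 3*IQR, Q3 + 3*IQR].

    Quartiles come from statistics.quantiles with its default method
    (an assumption; the source does not specify the quartile definition).
    """
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower = q1 - FENCE_MULTIPLIER * iqr
    upper = q3 + FENCE_MULTIPLIER * iqr
    return [v for v in values if lower <= v <= upper]


if __name__ == "__main__":
    print(drop_extreme_outliers([10, 11, 12, 13, 14, 15, 16, 100]))
```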

At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that all combinations of linker lengths could be tested for a specific interaction in a block of four adjacent strains (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
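The classification step can be sketched as below, starting from raw p-values that would come from the paired t-tests. The source does not name the FDR procedure; the Benjamini-Hochberg adjustment used here is an assumption, as are the function names. Pairs that are significant but change by 1 or less are not covered by the description above, so the sketch labels them "unclassified" rather than guessing.

```python
FDR_ALPHA = 0.05
MIN_ABS_DIFFERENCE = 1.0


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def classify(difference, fdr_pvalue):
    """Classify one linker combination against the 2xL-2xL reference."""
    if fdr_pvalue >= FDR_ALPHA:
        return "unchanged"
    if abs(difference) > MIN_ABS_DIFFERENCE:
        return "quantitative change"
    return "unclassified"  # significant but small; not defined in the text


if __name__ == "__main__":
    # Made-up p-values and differences, for illustration only.
    raw = [0.01, 0.04, 0.03, 0.20]
    diffs = [1.5, 0.4, -1.2, 2.0]
    for diff, q in zip(diffs, benjamini_hochberg(raw)):
        print(diff, round(q, 4), classify(diff, q))
```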

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core, while the inner rings of the core were taken from 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format, and the coordinates of each dot were extracted from this file. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
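The surface-path distance can be sketched as follows. The source used NetworkX; this dependency-free version uses `heapq` instead and is only an illustration of the same idea: nodes are surface Cα coordinates, edges join nodes closer than a cutoff, edge weights are Euclidean distances, and the reported distance is the length of the weighted shortest path. The function names and the toy coordinates are hypothetical.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`; weight = distance.

    `coords` maps a node label to its (x, y, z) coordinates in angstroms.
    """
    graph = {node: [] for node in coords}
    nodes = list(coords)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            distance = math.dist(coords[a], coords[b])
            if distance <= cutoff:
                graph[a].append((b, distance))
                graph[b].append((a, distance))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra shortest path length; math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist_so_far, node = heapq.heappop(heap)
        if node == target:
            return dist_so_far
        if dist_so_far > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = dist_so_far + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf


if __name__ == "__main__":
    # Toy coordinates, for illustration only.
    toy = {"a": (0, 0, 0), "b": (10, 0, 0), "c": (20, 0, 0), "d": (100, 0, 0)}
    g = build_graph(toy, cutoff=15)
    print(shortest_path_length(g, "a", "c"))
```

Restricting the path to surface residues means the distance follows the outside of the complex rather than cutting through it, which is closer to the route a flexible linker would have to take.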

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, each fused to the DHFR F[1,2] fragment with the 2xL, 3xL and 4xL. Using high-density yeast colony arrays (57), we queried these baits (n = 45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These preys include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions tested in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal are between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
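The size of the screen and the way a z-score threshold turns colony sizes into called PPIs can be illustrated with a short sketch. The scoring scheme below is an assumption made for illustration: the text only states that z-scores quantify deviation from background noise, so the choice of background population, the function names (`z_scores`, `call_hits`), the toy colony sizes and the threshold passed in the example are all hypothetical.

```python
from statistics import mean, stdev


def z_scores(colony_sizes, background):
    """Standardize colony sizes against a background (noise) distribution."""
    mu = mean(background)
    sigma = stdev(background)
    return [(size - mu) / sigma for size in colony_sizes]


def call_hits(colony_sizes, background, threshold):
    """Flag a pair as a detected PPI when its z-score exceeds `threshold`."""
    return [z > threshold for z in z_scores(colony_sizes, background)]


# Size of the screen described in the text: 15 proteins x 3 linker lengths as baits.
n_baits = 15 * 3
n_preys = 592
n_tested = n_baits * n_preys

# Toy data (hypothetical colony sizes); the threshold value is an assumption.
toy_background = [10, 12, 11, 9, 13, 10, 11, 12]
toy_hits = call_hits([11, 20], toy_background, threshold=2.5)
```

The bait count of 45 and the total of 26,640 tested pairs follow directly from the numbers given in the paragraph above.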

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, the proteasome and the COG complex), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL) for all proteins within a complex. As a negative control, PPIs between the RNApol I, II and III and the COG complex were also tested. Among the 10,192 unique tested PPIs, 755 interactions were considered true PPIs after filtering (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique pairs when reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], are counted as a single PPI).
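Counting reciprocal interactions only once amounts to treating each bait-prey pair as an unordered pair. A minimal sketch of that bookkeeping is given below; the function name `unique_pairs` and the placeholder proteins X, Y and Z are hypothetical and only serve the illustration.

```python
def unique_pairs(detected):
    """Collapse (bait, prey) tuples into unordered pairs so that
    X-bait/Y-prey and Y-bait/X-prey are counted as a single PPI."""
    return {frozenset(pair) for pair in detected}


# Toy input: X-Y was detected in both orientations, X-Z in only one.
toy_detected = [("X", "Y"), ("Y", "X"), ("X", "Z")]
toy_unique = unique_pairs(toy_detected)
```

Using `frozenset` also keeps a protein paired with itself as a valid single-element entry, so homomeric interactions are not lost when orientations are merged.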

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). The PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.
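The pair counts reported in this paragraph add up consistently, which a few lines of arithmetic make explicit. The dictionary layout and variable names below are only an illustrative way of organizing the reported numbers; the counts themselves are taken from the text.

```python
# Counts from the text: all interacting pairs, and pairs with reciprocal
# interactions merged ("unique").
counts = {
    "all": {"unchanged": 135, "changed": 19, "new": 74},
    "unique": {"unchanged": 114, "changed": 18, "new": 65},
}

totals = {name: sum(group.values()) for name, group in counts.items()}
affected = {name: group["changed"] + group["new"] for name, group in counts.items()}
unchanged_fraction = {
    name: group["unchanged"] / totals[name] for name, group in counts.items()
}
new_fraction_unique = counts["unique"]["new"] / totals["unique"]

for name in counts:
    print(f"{name}: total={totals[name]}, affected={affected[name]}, "
          f"unchanged={unchanged_fraction[name]:.1%}")
```

The unchanged fractions come out just under 60% in both cases, and about a third of the unique interacting pairs are newly detected with at least one 4xL.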

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved further with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1, from the core subunit, and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.
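The comparison between subcomplex combinations rests on the fraction of detected PPIs out of all tested pairs for each combination. A minimal sketch of that calculation is shown below. The function name `detection_fractions` and the toy records are hypothetical; only the subcomplex labels (base, core, lid) come from the text.

```python
from collections import defaultdict


def detection_fractions(tested):
    """tested: iterable of (subcomplex_a, subcomplex_b, detected) records.
    Returns, for each unordered subcomplex combination, detected / tested."""
    counts = defaultdict(lambda: [0, 0])  # combination -> [detected, tested]
    for sub_a, sub_b, detected in tested:
        key = tuple(sorted((sub_a, sub_b)))
        counts[key][1] += 1
        if detected:
            counts[key][0] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}


# Toy records for one linker combination (hypothetical values).
toy_tested = [
    ("base", "base", True),
    ("base", "base", True),
    ("base", "lid", False),
    ("lid", "base", True),
    ("core", "core", True),
    ("core", "lid", False),
]
toy_fractions = detection_fractions(toy_tested)
```

Running the same function separately on the records of each linker combination would give the per-combination fractions that are compared across linker lengths.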

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases the signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.
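The relationship between pairwise distance and z-score is summarized with a Spearman rank correlation. The sketch below shows how such a coefficient can be computed from scratch. It is an illustration under stated assumptions: the function names and the toy values are hypothetical, no p-value is computed, and the analysis in the study was presumably run with a statistics package rather than with code like this.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (hypothetical): z-scores decrease as C-terminal distance increases.
toy_distances = [10, 20, 30, 40]
toy_zscores = [8, 6, 5, 1]
toy_rho = spearman(toy_distances, toy_zscores)
```

A negative coefficient, as in the toy data, corresponds to the trend described above: the farther apart two C-termini are, the weaker the PCA signal tends to be.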

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circles: RNApol I, II and III. Orange squares: proteasome. Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker combination but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, counting reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between the z-scores of all detected PPIs in the proteasome and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini: interactions that are not affected by longer linkers, those that increase in signal and those that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.


Table S2B. Identity between each RNApol structure and the experimental sequences

Reference  Yeast protein  Complex  Identity (%)
4C2M chain 1  Rpc10  RNApol I  100
4C2M chain 2  Rpa34  RNApol I  92.4
4C2M chain 3  Rpa49  RNApol I  94.4
4C2M chain 4  Rpa43  RNApol I  100
4C2M chain 5  Rpa190  RNApol I  89.7
4C2M chain 6  Rpc40  RNApol I  100
4C2M chain 7  Rpa135  RNApol I  97.2
4C2M chain 8  Rpb5  RNApol I  100
4C2M chain 9  Rpa14  RNApol I  59.6
4C2M chain 10  Rpa43  RNApol I  81.4
4C2M chain 11  Rpo26  RNApol I  100
4C2M chain 12  Rpa12  RNApol I  100
4C2M chain 13  Rpb8  RNApol I  88.2
4C2M chain 14  Rpc19  RNApol I  100
4C2M chain 15  Rpb10  RNApol I  100
4C2M chain 16  Rpa49  RNApol I  100
4C2M chain 17  Rpc10  RNApol I  100
4C2M chain 18  Rpa43  RNApol I  100
4C2M chain 19  Rpa34  RNApol I  92.4
4C2M chain 20  Rpa135  RNApol I  96.2
4C2M chain 21  Rpa190  RNApol I  88.5
4C2M chain 22  Rpa14  RNApol I  55.1
4C2M chain 23  Rpc40  RNApol I  100
4C2M chain 24  Rpo26  RNApol I  100
4C2M chain 25  Rpb5  RNApol I  100
4C2M chain 26  Rpb8  RNApol I  88.2
4C2M chain 27  Rpa43  RNApol I  80.2
4C2M chain 28  Rpb10  RNApol I  100
4C2M chain 29  Rpa12  RNApol I  96
4C2M chain 30  Rpc19  RNApol I  100
4C3I chain A  Rpa190  RNApol I  89.2
4C3I chain C  Rpc40  RNApol I  99.3
4C3I chain B  Rpa135  RNApol I  98.2
4C3I chain E  Rpb5  RNApol I  100
4C3I chain D  Rpa14  RNApol I  55.1
4C3I chain G  Rpa43  RNApol I  78.3
4C3I chain F  Rpo26  RNApol I  100
4C3I chain I  Rpa12  RNApol I  100
4C3I chain H  Rpb8  RNApol I  84.7
4C3I chain K  Rpc19  RNApol I  100
4C3I chain J  Rpb10  RNApol I  100
4C3I chain M  Rpa49  RNApol I  97.2
4C3I chain L  Rpc10  RNApol I  100
4C3I chain N  Rpa34  RNApol I  88
4V1N chain A  Rpo21  RNApol II  97.9
4V1N chain C  Rpb3  RNApol II  100
4V1N chain B  Rpb2  RNApol II  93.6
4V1N chain E  Rpb5  RNApol II  100
4V1N chain D  Rpb4  RNApol II  80.8
4V1N chain G  Rpb7  RNApol II  100
4V1N chain F  Rpo26  RNApol II  100
4V1N chain I  Rpb9  RNApol II  100
4V1N chain H  Rpb8  RNApol II  91
4V1N chain K  Rpb11  RNApol II  100
4V1N chain J  Rpb10  RNApol II  100
4V1N chain L  Rpc10  RNApol II  100
4V1N chain R  Tfg2  RNApol II  60.3
5FJA chain A  Rpo31  RNApol III  96.2
5FJA chain C  Rpc40  RNApol III  100
5FJA chain B  Ret1  RNApol III  100
5FJA chain E  Rpb5  RNApol III  100
5FJA chain D  Rpc17  RNApol III  73.9
5FJA chain G  Rpc25  RNApol III  85.8
5FJA chain F  Rpo26  RNApol III  100
5FJA chain I  Rpc11  RNApol III  82.7
5FJA chain H  Rpb8  RNApol III  94.5
5FJA chain K  Rpc19  RNApol III  100
5FJA chain J  Rpb10  RNApol III  100
5FJA chain M  Rpc37  RNApol III  84.9
5FJA chain L  Rpc10  RNApol III  100
5FJA chain O  Rpc82  RNApol III  84.3
5FJA chain N  Rpc53  RNApol III  73.8
5FJA chain Q  Rpc31  RNApol III  100
5FJA chain P  Rpc34  RNApol III  57.2


Table S2C. Identity between the proteasome structures and the experimental sequences

Reference  Yeast protein  Complex  Identity (%)
5CZ4-centered chain A  Pre8  Proteasome  100
5CZ4-centered chain AA  Pre4  Proteasome  100
5CZ4-centered chain B  Pre9  Proteasome  100
5CZ4-centered chain BA  Pre3  Proteasome  100
5CZ4-centered chain C  Pre6  Proteasome  100
5CZ4-centered chain D  Pup2  Proteasome  97.1
5CZ4-centered chain E  Pre5  Proteasome  100
5CZ4-centered chain F  Pre10  Proteasome  100
5CZ4-centered chain G  Scl1  Proteasome  100
5CZ4-centered chain H  Pup1  Proteasome  100
5CZ4-centered chain I  Pup3  Proteasome  100
5CZ4-centered chain J  Pre1  Proteasome  100
5CZ4-centered chain K  Pre2  Proteasome  100
5CZ4-centered chain L  Pre7  Proteasome  100
5CZ4-centered chain M  Pre4  Proteasome  100
5CZ4-centered chain N  Pre3  Proteasome  100
5CZ4-centered chain O  Pre8  Proteasome  100
5CZ4-centered chain P  Pre9  Proteasome  100
5CZ4-centered chain Q  Pre6  Proteasome  100
5CZ4-centered chain R  Pup2  Proteasome  97.1
5CZ4-centered chain S  Pre5  Proteasome  100
5CZ4-centered chain T  Pre10  Proteasome  100
5CZ4-centered chain U  Scl1  Proteasome  100
5CZ4-centered chain V  Pup1  Proteasome  100
5CZ4-centered chain W  Pup3  Proteasome  100
5CZ4-centered chain X  Pre1  Proteasome  100
5CZ4-centered chain Y  Pre2  Proteasome  100
5CZ4-centered chain Z  Pre7  Proteasome  100
5A5B-centered chain A  Pre3  Proteasome  100
5A5B-centered chain AA  Rpn7  Proteasome  100
5A5B-centered chain B  Pup1  Proteasome  100
5A5B-centered chain BA  Rpn3  Proteasome  100
5A5B-centered chain C  Pup3  Proteasome  100
5A5B-centered chain CA  Rpn12  Proteasome  100
5A5B-centered chain D  Pre1  Proteasome  100
5A5B-centered chain DA  Rpn8  Proteasome  82.9
5A5B-centered chain E  Pre2  Proteasome  99.5
5A5B-centered chain EA  Rpn11  Proteasome  89.5
5A5B-centered chain F  Pre7  Proteasome  100
5A5B-centered chain FA  Rpn10  Proteasome  100
5A5B-centered chain G  Pre4  Proteasome  100
5A5B-centered chain GA  Rpn13  Proteasome  100
5A5B-centered chain HA  Sem1  Proteasome  100
5A5B-centered chain IA  Rpn1  Proteasome  85.9
5A5B-centered chain J  Scl1  Proteasome  100
5A5B-centered chain K  Pre8  Proteasome  100
5A5B-centered chain L  Pre9  Proteasome  100
5A5B-centered chain M  Pre6  Proteasome  100
5A5B-centered chain N  Pup2  Proteasome  100
5A5B-centered chain O  Pre5  Proteasome  100
5A5B-centered chain P  Pre10  Proteasome  100
5A5B-centered chain Q  Rpt1  Proteasome  88
5A5B-centered chain R  Rpt2  Proteasome  100
5A5B-centered chain S  Rpt6  Proteasome  100
5A5B-centered chain T  Rpt3  Proteasome  100
5A5B-centered chain U  Rpt4  Proteasome  100
5A5B-centered chain V  Rpt5  Proteasome  93.1
5A5B-centered chain W  Rpn2  Proteasome  90.9
5A5B-centered chain X  Rpn9  Proteasome  100
5A5B-centered chain Y  Rpn5  Proteasome  100
5A5B-centered chain Z  Rpn6  Proteasome  100
Constructed proteasome chain 1  Pup1  Proteasome  100
Constructed proteasome chain 10  Pre8  Proteasome  100
Constructed proteasome chain 11  Pre9  Proteasome  100
Constructed proteasome chain 12  Pre6  Proteasome  100
Constructed proteasome chain 13  Pup2  Proteasome  100
Constructed proteasome chain 14  Pre5  Proteasome  100
Constructed proteasome chain 15  Pre10  Proteasome  100
Constructed proteasome chain 16  Rpt1  Proteasome  88
Constructed proteasome chain 17  Rpt2  Proteasome  100
Constructed proteasome chain 18  Rpt6  Proteasome  100
Constructed proteasome chain 19  Rpt3  Proteasome  100
Constructed proteasome chain 2  Pup3  Proteasome  100
Constructed proteasome chain 20  Rpt4  Proteasome  100
Constructed proteasome chain 21  Rpt5  Proteasome  93.1
Constructed proteasome chain 22  Rpn2  Proteasome  90.9
Constructed proteasome chain 23  Rpn9  Proteasome  100
Constructed proteasome chain 24  Rpn5  Proteasome  100
Constructed proteasome chain 25  Rpn6  Proteasome  100
Constructed proteasome chain 26  Rpn7  Proteasome  100
Constructed proteasome chain 27  Rpn3  Proteasome  100
Constructed proteasome chain 28  Rpn12  Proteasome  100
Constructed proteasome chain 29  Rpn8  Proteasome  82.9
Constructed proteasome chain 3  Pre1  Proteasome  100
Constructed proteasome chain 30  Rpn11  Proteasome  89.5
Constructed proteasome chain 31  Rpn10  Proteasome  100
Constructed proteasome chain 32  Rpn13  Proteasome  100
Constructed proteasome chain 33  Sem1  Proteasome  100
Constructed proteasome chain 34  Rpn1  Proteasome  85.9
Constructed proteasome chain 35  Pup1  Proteasome  100
Constructed proteasome chain 36  Pup3  Proteasome  100
Constructed proteasome chain 37  Pre1  Proteasome  100
Constructed proteasome chain 38  Pre2  Proteasome  100
Constructed proteasome chain 39  Pre7  Proteasome  100
Constructed proteasome chain 4  Pre2  Proteasome  100
Constructed proteasome chain 40  Pre4  Proteasome  100
Constructed proteasome chain 41  Pre3  Proteasome  100
Constructed proteasome chain 42  Pre4  Proteasome  100
Constructed proteasome chain 45  Scl1  Proteasome  100
Constructed proteasome chain 46  Pre8  Proteasome  100
Constructed proteasome chain 47  Pre9  Proteasome  100
Constructed proteasome chain 48  Pre6  Proteasome  100
Constructed proteasome chain 49  Pup2  Proteasome  100
Constructed proteasome chain 5  Pre7  Proteasome  100
Constructed proteasome chain 50  Pre5  Proteasome  100
Constructed proteasome chain 51  Pre10  Proteasome  100
Constructed proteasome chain 52  Rpt1  Proteasome  88
Constructed proteasome chain 53  Rpt2  Proteasome  100
Constructed proteasome chain 54  Rpt6  Proteasome  100
Constructed proteasome chain 55  Rpt3  Proteasome  100
Constructed proteasome chain 56  Rpt4  Proteasome  100
Constructed proteasome chain 57  Rpt5  Proteasome  93.1
Constructed proteasome chain 58  Rpn2  Proteasome  90.9
Constructed proteasome chain 59  Rpn9  Proteasome  100
Constructed proteasome chain 6  Pre3  Proteasome  100
Constructed proteasome chain 60  Rpn5  Proteasome  100
Constructed proteasome chain 61  Rpn6  Proteasome  100
Constructed proteasome chain 62  Rpn7  Proteasome  100
Constructed proteasome chain 63  Rpn3  Proteasome  100
Constructed proteasome chain 64  Rpn12  Proteasome  100
Constructed proteasome chain 65  Rpn8  Proteasome  82.9
Constructed proteasome chain 66  Rpn11  Proteasome  89.5
Constructed proteasome chain 67  Rpn10  Proteasome  100
Constructed proteasome chain 68  Rpn13  Proteasome  100
Constructed proteasome chain 69  Sem1  Proteasome  100
Constructed proteasome chain 70  Rpn1  Proteasome  85.9
Constructed proteasome chain 9  Scl1  Proteasome  100


Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

Yeast protein  Complex  Reference  Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. The Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome: top right; RNApol I, II and III: bottom left; COG complex: bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions between proteins that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the cores of the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are in close proximity in space without these associations necessarily being physical interactions. Such a method would make it possible to explore and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results obtained with distances measured from the available protein structures of the proteasome.

The results of the first validation show that by adjusting a single parameter, namely doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is maintained despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary depending on the number of combinations of linkers of different lengths that are used. For example, this number could be lower if only a single combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases. The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. Furthermore, the use of extended linkers provides information on the organization of protein complexes, particularly when it involves the central proteins. Finally, the detected associations reflect well the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results obtained, it is possible to confirm that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA is a notable advance in the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside the longer linkers; this would provide a validation and a point of comparison, and would help detect problems that occurred during construction of the fusion proteins. For example, it makes it easier to spot a faulty recombination event or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are brought very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium, so it is easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would give several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also need to be determined, to maximize the new information gained while taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be needed for large protein complexes.
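
The idea of a graded series of linker lengths can be illustrated numerically. The Python sketch below lists linkers built from the Gly-Gly-Gly-Gly-Ser motif given in the list of abbreviations (2xL, 3xL, 4xL) and computes a rough upper bound on the distance that a single linker could span. The value of 3.8 Å per residue (roughly the spacing of consecutive residues in a fully extended chain) and all function names are assumptions made for illustration. A flexible linker would on average span less than this upper bound, and this is not the distance analysis performed in this study.

```python
# Illustrative sketch only: the 3.8 Å-per-residue figure and all names below
# are assumptions, not values or methods taken from the thesis.
MOTIF = "GGGGS"  # Gly-Gly-Gly-Gly-Ser repeat unit of the 2xL, 3xL and 4xL linkers
ANGSTROM_PER_RESIDUE = 3.8  # assumed residue spacing in a fully extended chain


def linker_sequence(repeats: int) -> str:
    """Return the amino-acid sequence of a linker with the given number of repeats."""
    if repeats < 1:
        raise ValueError("a linker needs at least one repeat")
    return MOTIF * repeats


def max_extended_length(repeats: int) -> float:
    """Upper bound (in Å) on the span of one fully extended linker."""
    return len(linker_sequence(repeats)) * ANGSTROM_PER_RESIDUE


def linker_series(min_repeats: int, max_repeats: int, step: int = 1):
    """List (name, residue count, upper-bound length in Å) for a range of linkers."""
    return [
        (f"{n}xL", len(linker_sequence(n)), round(max_extended_length(n), 1))
        for n in range(min_repeats, max_repeats + 1, step)
    ]


if __name__ == "__main__":
    for name, residues, length in linker_series(2, 4):
        print(f"{name}: {residues} residues, at most ~{length} Å when fully extended")
```

Comparing the upper bounds of two linkers in such a series gives a first sense of how much additional reach a longer linker could add, which is one way to think about choosing the increment between linker lengths.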

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. At the scale of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.

Bibliography

1 Vidal M Cusick ME Barabasi AL Interactome networks and human disease Cell 2011144(6)986-98 2 Taylor SS Ilouz R Zhang P Kornev AP Assembly of allosteric macromolecular switches lessons from PKA Nature reviews Molecular cell biology 201213(10)646-58 3 Vandamme J Castermans D Thevelein JM Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation Cellular signalling 201224(8)1610-8 4 Conrad M Schothorst J Kankipati HN Van Zeebroeck G Rubio-Texeira M Thevelein JM Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae FEMS microbiology reviews 201438(2)254-99 5 Broach JR RAS genes in Saccharomyces cerevisiae signal transduction in search of a pathway Trends in genetics TIG 19917(1)28-33 6 Fontana L Partridge L Longo VD Extending healthy life span--from yeast to humans Science 2010328(5976)321-6 7 Wong W Scott JD AKAP signalling complexes focal points in space and time Nature reviews Molecular cell biology 20045(12)959-70 8 Beuschlein F Fassnacht M Assie G Calebiro D Stratakis CA Osswald A et al Constitutive activation of PKA catalytic subunit in adrenal Cushings syndrome N Engl J Med 2014370(11)1019-28 9 Bult CJ Drabkin HJ Evsikov A Natale D Arighi C Roberts N et al The representation of protein complexes in the Protein Ontology (PRO) BMC Bioinformatics 201112371 10 Peters JM Cejka Z Harris JR Kleinschmidt JA Baumeister W Structural features of the 26 S proteasome complex J Mol Biol 1993234(4)932-7 11 Voges D Zwickl P Baumeister W The 26S proteasome a molecular machine designed for controlled proteolysis Annual review of biochemistry 1999681015-68 12 Tanaka K The proteasome overview of structure and functions Proceedings of the Japan Academy Series B Physical and biological sciences 200985(1)12-36 13 Wehmer M Sakata E Recent advances in the structural biology of the 26S proteasome Int J Biochem Cell Biol 201679437-42 14 Gomes AV Genetics of proteasome diseases Scientifica 20132013637629 15 Miller Z Ao L Kim KB Lee 
W Inhibitors of the immunoproteasome current status and future directions Current pharmaceutical design 201319(22)4140-51 16 Kaur G Batra S Emerging role of immunoproteasomes in pathophysiology Immunology and cell biology 201694(9)812-20 17 Rual J-F Venkatesan K Hao T Hirozane-Kishikawa T Dricot A Li N et al Towards a proteome-scale map of the human protein-protein interaction network Nature 2005437(7062)1173-8 18 Krogan NJ Cagney G Yu H Zhong G Guo X Ignatchenko A et al Global landscape of protein complexes in the yeast Saccharomyces cerevisiae Nature 2006440(7084)637-43 19 Collins SR Kemmeren P Zhao XC Greenblatt JF Spencer F Holstege FC et al Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae Molecular amp cellular proteomics MCP 20076(3)439-50 20 Gavin AC Aloy P Grandi P Krause R Boesche M Marzioch M et al Proteome survey reveals modularity of the yeast cell machinery Nature 2006440(7084)631-6 21 Giot L Bader JS Brouwer C Chaudhuri A Kuang B Li Y et al A protein interaction map of Drosophila melanogaster Science 2003302(5651)1727-36

22 Li S Armstrong CM Bertin N Ge H Milstein S Boxem M et al A map of the interactome network of the metazoan C elegans Science 2004303(5657)540-3 23 Rajagopala SV Sikorski P Kumar A Mosca R Vlasblom J Arnold R et al The binary protein-protein interaction landscape of Escherichia coli Nat Biotech 201432(3)285-90 24 Parrish JR Yu J Liu G Hines JA Chan JE Mangiola BA et al A proteome-wide protein interaction map for Campylobacter jejuni Genome Biology 20078(7)1-19 25 Wang Y Cui T Zhang C Yang M Huang Y Li W et al Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv Journal of proteome research 20109(12)6665-77 26 Cherkasov A Hsing M Zoraghi R Foster LJ See RH Stoynov N et al Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus Journal of proteome research 201110(3)1139-50 27 Hagen N Bayer K Rosch K Schindler M The intraviral protein interaction network of hepatitis C virus Molecular amp cellular proteomics MCP 201413(7)1676-89 28 Fossum E Friedel CC Rajagopala SV Titz B Baiker A Schmidt T et al Evolutionarily conserved herpesviral protein interaction networks PLoS pathogens 20095(9)e1000570 29 Stellberger T Hauser R Baiker A Pothineni VR Haas J Uetz P Improving the yeast two-hybrid system with permutated fusions proteins the Varicella Zoster Virus interactome Proteome science 201088 30 Obado SO Brillantes M Uryu K Zhang W Ketaren NE Chait BT et al Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex PLoS biology 201614(2)e1002365 31 Diss G Dube AK Boutin J Gagnon-Arsenault I Landry CR A systematic approach for the genetic dissection of protein complexes in living cells Cell Rep 20133(6)2155-67 32 Ferreira LG Oliva G Andricopulo AD Protein-protein interaction inhibitors advances in anticancer drug design Expert opinion on drug discovery 2016 33 Hamdi A Colas P Yeast two-hybrid methods and their applications in drug discovery Trends in pharmacological sciences 
201233(2)109-18 34 Zoraghi R Reiner NE Protein interaction networks as starting points to identify novel antimicrobial drug targets Current opinion in microbiology 201316(5)566-72 35 Khare S Nagle AS Biggart A Lai YH Liang F Davis LC et al Proteasome inhibition for treatment of leishmaniasis Chagas disease and sleeping sickness Nature 2016 36 Sahni N Yi S Taipale M Fuxman Bass JI Coulombe-Huntington J Yang F et al Widespread macromolecular interaction perturbations in human genetic disorders Cell 2015161(3)647-60 37 Jensen LJ Bork P Biochemistry Not comparable but complementary Science 2008322(5898)56-7 38 Syafrizayanti Betzen C Hoheisel JD Kastelic D Methods for analyzing and quantifying protein-protein interaction Expert review of proteomics 201411(1)107-20 39 Marcilla M Albar JP Quantitative proteomics A strategic ally to map protein interaction networks IUBMB life 201365(1)9-16 40 Woods AG Sokolowska I Ngounou Wetie AG Wormwood K Aslebagh R Patel S et al Mass spectrometry for proteomics-based investigation Advances in experimental medicine and biology 20148061-32 41 Chen GI Gingras AC Affinity-purification mass spectrometry (AP-MS) of serinethreonine phosphatases Methods 200742(3)298-305 42 Dunham WH Mullin M Gingras AC Affinity-purification coupled to mass spectrometry basic principles and strategies Proteomics 201212(10)1576-90

43 Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69 44 Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6 45 Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32 46 Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54 47 Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36) 48 Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82 49 Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88 50 Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848 51 Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6 52 Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4 53 Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63 54 Michnick SW Exploring protein interactions by interaction-induced folding of proteins from 
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14

64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9

84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709

Table of contents

Résumé
Abstract
Table of contents
List of tables
List of figures
List of abbreviations
Acknowledgements
Foreword
General introduction
    1.1 The fundamental nature of protein-protein interactions
    1.2 Practical applications of the study of protein-protein interactions
    1.3 Categories of methods for studying protein-protein interactions
        1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry
        1.3.2 Methods determining the network of protein interactions
    1.4 Current challenge in the study of protein-protein interactions
    1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions
    1.6 Research objectives
Measuring proximate protein association in living cells using Protein-fragment complementation assay (PCA)
    Résumé
    Abstract
    Introduction
    Material and Methods
        Yeast
        Bacteria
        Plasmid construction
        Strain construction
        Estimation of protein abundance
        Protein-fragment complementation assays
        PCA images and statistical analyses
        Analysis of protein distances within complexes
    Results and discussion
        Longer linkers increase signal-to-noise ratio in large-scale screens
        PCA signal reflects the super-organization of protein complexes
        Longer linkers allow detection of more distant proteins in complexes
    Conclusion
    Acknowledgements
General conclusion
Bibliography

List of tables

Table S1A. Description of the strains constructed and used for this study
Table S1B. PCA data for global PCA experiment
Table S1C. PCA data for intra-complexes experiment
Table S1D. PCR primers used in this study
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2B. Identity between each RNApol structures and the experimental sequences
Table S2C. Identity between proteasome structure and the experimental sequence
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

List of figures

Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale Protein-fragment complementation (PCA) screen and proves to be useful to infer the super-organization of protein complexes
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes
Figure S1. Data related to the PCA experiments
Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

List of abbreviations

%         Percentage
°C        Degree Celsius
Å         Ångström
DNA       Deoxyribonucleic acid
Amp       Ampicillin
mRNA      Messenger ribonucleic acid
BioID     Proximity-dependent biotinylation
ClonNAT   Nourseothricin
COG       Conserved oligomeric Golgi
DHFR      Dihydrofolate reductase
DMSO      Dimethyl sulfoxide
F[1,2]    Fragment 1,2 of DHFR
F[3]      Fragment 3 of DHFR
FDR       Corrected P value
FRET      Energy transfer between fluorescent molecules
g         Gram
Gly or G  Glycine
h         Hour
HygB      Hygromycin B
Is        Interaction score
L         Liter
Log       Logarithm
M         Molar
Min       Minute
mL        Milliliter
mM        Millimolar
MS        Mass spectrometry
MS/MS     Tandem mass spectrometry
MTX       Methotrexate
MYTH      Membrane yeast two-hybrid
NaCl      Sodium chloride
NMR       Nuclear magnetic resonance
OD        Optical density
PBS       Phosphate-buffered saline
PCA       Protein-fragment complementation assay
PCR       Polymerase chain reaction
PKA       Protein kinase A
PPI       Protein-protein interaction
Q1        First quartile
Q3        Third quartile
r         Correlation coefficient
RNApol    RNA polymerase
Sdb       Standard deviation
Ser or S  Serine
SDS       Sodium dodecyl sulfate
SDS-PAGE  Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test    Student's t-test
YPD       Yeast extract peptone dextrose
Y2H       Yeast two-hybrid
Zs        Z-score
µb        Estimated mean
µg        Microgram
µL        Microliter
µM        Micromolar
2YT       2× yeast extract tryptone
2xL       Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL       Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL       Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif

Acknowledgements

Completing this project required the help of several people, whom I sincerely wish to thank. First of all, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey, Christian encouraged me to give the best of myself, both scientifically and as a member of the group. He not only gave me the material means to do so, but also showed me that I had the ability to do so. Christian is a very present supervisor who is available for his students. He offered me opportunities and supported me in each of them.

I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr. Nicolas Bisson, for their advice and the time they devoted to this project.

I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the laboratory's two research professionals. Their great expertise and their passion for science are a pillar of this team. Without their invaluable advice, their dedication and their availability, carrying out this project would have been particularly arduous. I also wish to thank my collaborators, Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is all the better. Special thanks to Xavier for his help, his availability and the lively discussions.

I believe it is important to thank all the members of the Landry laboratory. Graduate studies involve spending a great deal of time in the laboratory, which becomes like a second home. Hence the importance of sharing laughter and building camaraderie with its members. I would like to thank them all for the chats and jokes at the famous "tea breaks", the lively discussions and, of course, the support, both in the laboratory and morally. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. A very special thank you to Guillaume and Hélène, who were especially good at bringing a smile to my face, or at supporting and advising me in difficult times.

It is also important to thank my parents, as well as all my family and friends. My parents have always encouraged me to fulfil myself and to love my work. They not only provided an ideal setting for reaching my goals throughout my studies, but also offered me their moral support and instilled in me the importance of always doing one's best. The values they passed on to me have given me a strong sense of responsibility, honesty and commitment. Thanks to my family and friends, I was able to unwind, simply have fun and pour my heart out from time to time. They were a moral support.

Finally, I wish to thank, from the bottom of my heart, my partner, Marc Bélanger. Marc is an incredibly generous person: generous with his time, his attention, his knowledge and his passions. He was an invaluable support throughout this journey, at every moment. His encouragement, his shoulder, his tissues and his understanding eased my fears and my sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has given me personally, humanly and professionally. Marc has made me a better person, and I will always be grateful to him. Thank you, my love, thank you for everything.

Foreword

This thesis consists of a single chapter written in the form of a scientific article that will be submitted for publication. The article presents the adaptation of the PCA method to detect associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to planning the experiments with Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research professionals). Several people, myself included, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were done jointly by Isabelle Gagnon-Arsenault, Christian Landry and myself.

During this project, I also contributed to the writing of a literature review published in Briefings in Functional Genomics in March 2016 under the title "Multi-scale perturbations of protein interactomes reveals their mechanisms of regulation, robustness and insights into genotype-phenotype maps". Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry. That article is not presented in this thesis.

General introduction

1.1 The fundamental nature of protein-protein interactions

Proteins, through their great diversity of roles, are regarded as the machinery of life. Their temporary or permanent associations are at the heart of signalling and regulatory pathways as well as of protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since they are involved in all cellular processes as well as in the maintenance of cellular functions.

Interactions that form transiently are often found in signalling and regulatory processes. They require excellent spatiotemporal coordination, which explains why poor coordination can lead to diseases such as cancer (1). One example of a transient association is that between the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, ageing and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).

In addition to transient interactions, the cell hosts stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs within a complex are stable, the complex itself may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins allows additional biological activities to emerge that would be impossible for the individual proteins. An example that illustrates this concept very well is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).

The two preceding examples clearly demonstrate the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning and cellular viability of a given organism. The PPI network has been mapped at large scale for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses (27-29). These maps provide a static picture of the network and do not fully account for the cell's ability to adapt to different conditions (e.g. environment, cell cycle). To overcome this limitation, additional maps were subsequently produced that take the dynamics of interaction networks into account by perturbing cell growth conditions. Among other things, they shed light on the adaptation or plasticity of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.

1.2 Concrete applications of the study of protein-protein interactions

The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore in yeast and trypanosomes (30). These two organisms, which diverged more than 1.5 billion years ago, show both similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the membrane of the cell nucleus and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mechanisms of mRNA export in the two organisms (30). In addition, perturbing PPIs makes it possible to elucidate the robustness of a protein complex to mutations, that is, the ability of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analyzing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of certain proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogues for robustness (31).

In the medical field, the study of PPIs has been widely used to discover new drugs (32-34). Moreover, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which could eventually make it possible to treat the infections caused by these parasites (35). PPIs also help to explain the genetic basis of diseases, as shown by Sahni and colleagues. This team examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, a perturbation of interaction networks, whether partial or complete, was responsible for the diseases under study. Furthermore, different mutations in the same gene cause different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, since each has advantages and limitations that allow it to target only a particular subset of the interaction network (37). Nevertheless, all of these methods can be divided into two main categories: methods that determine the composition of protein complexes, and methods that determine physical interactions between two proteins.

The first category includes methods that purify a protein complex, by affinity or separation chromatography, and then analyze it by mass spectrometry (MS). The second category comprises a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the second category share a very similar principle, based on the reconstitution of a functional reporter that emits a signal when the two proteins physically interact. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, the term "hybrid method" refers to methods that detect associations between proteins that are close together in space without these associations necessarily being physical interactions. These methods therefore combine features of both categories. For the purposes of this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary, because they define, on the one hand, the components of a protein complex and, on the other hand, the relationships these components maintain with one another.

1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purification of protein complexes followed by identification of their components by MS is a method that aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract using an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the non-specific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins found in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first MS scan, a peptide is selected and fragmented, and a new spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this purification is that it can isolate all the protein complexes of an organism for study (43).
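As an illustration of the mass-to-charge ratio on which the separation relies, the following minimal sketch computes m/z for a peptide ionized by protonation. It assumes positive-ion mode with one proton per unit of charge; the function name and the peptide mass are hypothetical and are not taken from this work.

```python
# Minimal sketch: m/z of a protonated peptide (positive-ion mode assumed).
PROTON_MASS_DA = 1.007276  # mass of a proton in daltons


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """Return m/z for a peptide of the given neutral mass carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


# Hypothetical peptide mass, chosen only for illustration.
peptide_mass = 1000.0
for z in (1, 2, 3):
    print(f"z = {z}: m/z = {mass_to_charge(peptide_mass, z):.4f}")
```

The same peptide therefore appears at several positions in the spectrum depending on its charge state, which is one reason spectra are compared with database profiles rather than read directly.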

1.3.2 Methods determining the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment complementation

Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments attached to the two proteins of interest via a linker. When the two proteins of interest physically interact, the two reporter fragments assemble, reconstituting a functional reporter that produces a detectable signal.

In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that enabled the development of several others. However, the technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variations of the method have nevertheless been introduced to allow the study of membrane proteins (45-47); this approach is the subject of the next paragraph. Second, since the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, suffers from background noise and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as humans, to be studied in a simpler model (51).

In MYTH, the two reporter fragments are those of a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest physically interact, the transcription factor attached to the reconstituted ubiquitin is released, activating transcription of a reporter gene. Split-ubiquitin-based methods have led to major advances in the study of membrane proteins, insoluble proteins and proteins located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the impossibility of quantifying the results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of a transcription factor, it uses as a reporter a protein that has been split into two fragments. The choice of reporter and of the cleavage site were key elements in the design of the method. Moreover, since the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to interact spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as a reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved, in particular, in the synthesis of certain DNA bases (purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative while preserving the native promoter of the proteins studied (48, 55, 56). Furthermore, PCA results suggest that the cellular localization of proteins is preserved: there is a Gene Ontology enrichment for several known proteins sharing the same cellular localization (55). However, a change in localization cannot be ruled out, given that the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. However, adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. It may be necessary to optimize the linker's composition or length. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are especially useful when a flexible linker is insufficient to separate the two elements or interferes with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on energy transfer between two fluorescent proteins in close proximity to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are close to each other. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and the partial overlap of the fluorescence of the two proteins can hamper interpretation of the results (60-63).

Cross-linking followed by MS is nearly identical to the purification and MS techniques, except that the proteins are covalently linked to one another before purification. These links withstand enzymatic digestion, providing structural information on how the proteins are associated within the protein complex. Nevertheless, cross-linking complicates data analysis and can lead to an incorrect picture of the architecture of the complex. This method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase that lacks specificity and is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to interactions between neighbouring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can interfere with the activity of the protein of interest or with the formation of interactions. In addition, the method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They provide information on the proximity of proteins, giving access to a new scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, the existing techniques require specific infrastructure (equipment and databases) and are difficult to apply at large scale. Developing simpler, higher-throughput hybrid methods would help to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement both categories of methods. These new hybrid methods would compensate for the shortcomings of high-resolution methods, such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to many protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment consisting of two repeats of the following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good association of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small amino acids, nonpolar and polar (uncharged), respectively. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also constrains the ability to detect an interaction, as observed notably by the research team that developed large-scale PCA (55). By studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 35 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. Moreover, an earlier study had shown that increasing the linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to obtain new structural information.
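The relationship between linker length and reachable distance can be sketched numerically. The example below is only a back-of-the-envelope estimate: it takes the two figures given above (two GGGGS repeats per linker, 82 Å for two linkers end to end), derives a length per residue from them, and assumes that the reach scales linearly with the number of residues, as for a fully extended linker. The function names and the longer linker used in the example are hypothetical.

```python
# Back-of-the-envelope sketch of the maximal span of two end-to-end linkers.
MOTIF = "GGGGS"                  # repeat unit of the linker (from the text)
REFERENCE_REPEATS = 2            # repeats per linker in the standard assay (from the text)
REFERENCE_SPAN_ANGSTROM = 82.0   # two standard linkers end to end (from the text)

# Assumption: length per residue inferred from the two reference values above.
ANGSTROM_PER_RESIDUE = REFERENCE_SPAN_ANGSTROM / (2 * REFERENCE_REPEATS * len(MOTIF))


def max_span(repeats_a: int, repeats_b: int) -> float:
    """Maximal distance (in Å) bridged by two fully extended linkers placed end to end."""
    residues = (repeats_a + repeats_b) * len(MOTIF)
    return residues * ANGSTROM_PER_RESIDUE


def within_reach(distance_angstrom: float, repeats_a: int, repeats_b: int) -> bool:
    """True if two C-termini separated by the given distance could be bridged."""
    return distance_angstrom < max_span(repeats_a, repeats_b)


# Hypothetical pair of C-termini 100 Å apart, tested with standard and doubled linkers.
for repeats in (2, 4):
    reachable = within_reach(100.0, repeats, repeats)
    print(f"{repeats} repeats per linker: span {max_span(repeats, repeats):.0f} Å, reachable: {reachable}")
```

Under these assumptions, a pair that lies beyond the reach of the standard linkers falls within reach once the linkers are lengthened, which is the intuition behind using linker length as a tunable parameter. A flexible linker is rarely fully extended in the cell, so the computed span is an upper bound.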

1.6 Research objectives

The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length in DHFR PCA would make it possible to detect interactions between proteins that are increasingly distant in space, thereby modulating the scale of molecular resolution. This adaptation would yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The final objective was to determine whether increasing the linker length allowed the detection of associations between proteins that are further apart in space. To do so, distances between the proteins contained in the proteasome structures were calculated and compared with the experimental results.
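The distance comparison described in the last objective can be illustrated with a short sketch. It is not the analysis pipeline used in this work: it assumes that each protein is represented by the coordinates of its C-terminus (where the reporter fragment is fused), and the protein names, coordinates and threshold are placeholders chosen for illustration.

```python
import math
from itertools import combinations

# Hypothetical C-terminal coordinates in Å; names and values are placeholders,
# not taken from any proteasome structure.
c_termini = {
    "ProteinA": (0.0, 0.0, 0.0),
    "ProteinB": (30.0, 40.0, 0.0),
    "ProteinC": (0.0, 0.0, 120.0),
}


def pairwise_distances(coords):
    """Euclidean distance (Å) between every pair of proteins, keyed by sorted name pair."""
    return {
        (a, b): math.dist(coords[a], coords[b])
        for a, b in combinations(sorted(coords), 2)
    }


distances = pairwise_distances(c_termini)

# Assumed detection threshold: the 82 Å span of two standard linkers.
THRESHOLD_ANGSTROM = 82.0
for pair, d in sorted(distances.items(), key=lambda item: item[1]):
    status = "within reach" if d < THRESHOLD_ANGSTROM else "out of reach"
    print(f"{pair[0]}-{pair[1]}: {d:.1f} Å ({status})")
```

Each pair classified this way can then be set against the experimental signal for the same pair, to check whether detected associations are enriched among pairs that lie within reach of the linkers.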

This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive for several reasons, notably the availability of many powerful genetic tools, its rapid cell division and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).

Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the dihydrofolate reductase-based protein-fragment complementation assay (DHFR PCA) in yeast to obtain low-resolution structural constraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are further apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex membership or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86).

The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary: the first tells us which proteins assemble into complexes in the cell, and the second how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint imposed by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 35 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
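The relationship between the number of repetitions and the linker length can be made explicit with a short sketch. It only restates what is described above (a Gly-Gly-Gly-Gly-Ser motif repeated two, three or four times); the helper name linker_peptide is hypothetical and is not part of the cloning procedure.

```python
# Hypothetical helper: peptide sequence of a linker made of n GGGGS repeats.
MOTIF = "GGGGS"  # Gly-Gly-Gly-Gly-Ser, the repeated motif described in the text


def linker_peptide(repeats: int) -> str:
    """Return the amino-acid sequence of a linker with `repeats` GGGGS motifs."""
    if repeats < 1:
        raise ValueError("a linker needs at least one repeat")
    return MOTIF * repeats


for name, n in (("2xL", 2), ("3xL", 3), ("4xL", 4)):
    seq = linker_peptide(n)
    print(name, seq, len(seq), "residues")
```

The 2xL therefore spans 10 residues, and the 3xL and 4xL span 15 and 20 residues, respectively.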

To replace the 2xL of pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in plasmid pUC57 with flanking BamHI and XbaI restriction sites. The 3xL/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI, and the fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were confirmed by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3xL/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3xL/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI, and the fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for pAG25-3xL/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. The 2xL/3xL/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). Proper DHFR fragment fusions were confirmed by PCR and Sanger sequencing for all strains.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains with proteins fused with the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies raised against them. Exponentially growing cells (20 OD600) were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2xL/3xL/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published by (84) to choose 95 central and associated protein members of the following complexes: the RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9; 7 tested), proteasome (n=47; 44 tested) and COG complex (n=8)) and because interactions among their protein members have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 out of the 95 proteins initially selected are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the difference in signal between linkers was increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and applied a manual threshold across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and circularity below 0.5, using the particle closest to the center. We considered the particle to be a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2 transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
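The transformation and the size filter described above can be sketched as follows. This is a minimal illustration, not the analysis script used for the screen: the function name log2_intensity is hypothetical, and the cutoff of 16 is assumed to be expressed in the same units as the colony size measured on the diploid selection plate.

```python
import math

MIN_DIPLOID_SIZE = 16  # cutoff on the diploid selection plate, as stated in the text


def log2_intensity(mtx_intensity, diploid_size):
    """Return log2(intensity + 1), or None when the diploid colony was too small.

    Adding 1 before taking the logarithm keeps colonies with a null
    intensity defined (log2(0 + 1) = 0).
    """
    if diploid_size < MIN_DIPLOID_SIZE:
        return None  # colony eliminated from the analysis
    return math.log2(mtx_intensity + 1)


print(log2_intensity(1023, diploid_size=120))
print(log2_intensity(0, diploid_size=120))
print(log2_intensity(500, diploid_size=10))
```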

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction across linker size combinations. We considered a change significant when the Zs differed by more than 2.
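The scoring scheme above can be illustrated with a small, self-contained sketch. It is not the mixtools implementation: the EM routine below (fit_two_normals) is a simplified pure-Python stand-in written for illustration, its initialisation is an arbitrary choice, and the scores are synthetic. Only the z-score formula and the detection threshold of 2.5 come from the text.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(scores, n_iter=200):
    """Fit a two-component 1-D normal mixture by EM (illustrative stand-in).

    Returns [(mean, sd, weight), (mean, sd, weight)] sorted by mean, so the
    first entry is taken to be the background component.
    """
    xs = sorted(scores)
    n = len(xs)
    grand_mean = sum(xs) / n
    grand_sd = math.sqrt(sum((x - grand_mean) ** 2 for x in xs) / n)
    # Assumed initialisation: one component in the bulk, one in the upper tail.
    means = [xs[n // 4], xs[(19 * n) // 20]]
    sds = [max(grand_sd / 2, 1e-6)] * 2
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in xs:
            p0 = weights[0] * normal_pdf(x, means[0], sds[0])
            p1 = weights[1] * normal_pdf(x, means[1], sds[1])
            total = (p0 + p1) or 1e-300
            resp.append((p0 / total, p1 / total))
        # M-step: update the weight, mean and sd of each component.
        for k in (0, 1):
            nk = max(sum(r[k] for r in resp), 1e-12)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)
            weights[k] = nk / n
    return sorted(zip(means, sds, weights))


def z_score(interaction_score, b, sdb):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background component."""
    return (interaction_score - b) / sdb


# Synthetic scores, for illustration only: 900 background and 100 signal values.
random.seed(1)
scores = [random.gauss(10.0, 0.5) for _ in range(900)]
scores += [random.gauss(14.0, 0.8) for _ in range(100)]

(b, sdb, _), _signal = fit_two_normals(scores)
detected = [s for s in scores if z_score(s, b, sdb) > 2.5]
print(f"background mean={b:.2f}, sd={sdb:.2f}, detected={len(detected)}")
```

Taking the component with the lower mean as the background is itself an assumption of this sketch; it holds when non-interacting pairs form the bulk of the distribution, as expected in such a screen.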

For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
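The outlier rule and the replicate filter can be sketched as below. The function names are hypothetical, the quartiles are computed with the default method of Python's statistics.quantiles (which may differ slightly from the quartile definition used in the original analysis), and applying the fences to the replicates of a single interaction is an assumption made to keep the example small.

```python
import statistics


def drop_extreme_outliers(values, k=3.0):
    """Remove values below Q1 - k*(Q3 - Q1) or above Q3 + k*(Q3 - Q1)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def interaction_score(replicates, min_replicates=4):
    """Median of the retained replicates, or None if too few replicates remain."""
    kept = drop_extreme_outliers(replicates)
    if len(kept) < min_replicates:
        return None
    return statistics.median(kept)


# Illustrative replicate values (log2 colony sizes); 30.0 is an extreme outlier.
replicates = [9.8, 9.9, 10.0, 10.0, 10.1, 10.2, 10.3, 30.0]
print(interaction_score(replicates))
```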

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that all combinations of linker lengths could be tested for a specific interaction within a block of four adjacent strains (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
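The three categories can be written as a small decision rule. The sketch below makes two assumptions that are not stated in the text: the FDR correction is taken to be the Benjamini-Hochberg procedure, and a pair with a significant p-value but an absolute difference of at most 1 is placed in the unchanged category. The p-values themselves are supplied as inputs rather than computed from a paired t-test.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(detected_reference, detected_longer, difference, fdr_p):
    """Classify one protein pair for one longer linker combination.

    detected_reference: interaction detected with 2xL-2xL
    detected_longer:    interaction detected with the longer combination
    difference:         Is(longer combination) - Is(2xL-2xL)
    fdr_p:              FDR-corrected p-value of the paired t-test
    """
    if not detected_reference:
        return "new" if detected_longer else "not detected"
    if abs(difference) > 1 and fdr_p < 0.05:
        return "quantitative change"
    return "unchanged"  # assumed fallback for all remaining detected pairs


adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(adjusted)
print(classify(True, True, difference=1.5, fdr_p=0.01))
```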

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in the PyMOL software. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core, and of 5CZ4 for the inner rings of the core. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes lying within a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal distances possible was used for the analyses.
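The surface-path distance can be illustrated with a toy example. The sketch below is a pure-Python stand-in for the NetworkX implementation used in the analysis: it builds a graph from a handful of made-up coordinates (not taken from any structure), connects nodes lying within a cutoff, weights edges by Euclidean distance and runs Dijkstra's algorithm. The function names and the coordinates are hypothetical.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`; weight = distance."""
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Length of the weighted shortest path (Dijkstra); inf if unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf


# Toy "surface": three points forming a right angle (coordinates in Å).
surface = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 10.0, 0.0)]
graph = build_graph(surface, cutoff=12.0)
print(shortest_path_length(graph, 0, 2))   # path along the surface nodes
print(math.dist(surface[0], surface[2]))   # straight-line distance
```

In this toy case the path along the nodes is longer than the straight line between the two end points, which is the reason for measuring distances over the surface rather than through the complex.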

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, the effect is minor and not generalized.

To test the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL each fused to the DHFR F[1,2] fragment. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in size. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtering (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI).
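The distinction between all pairs and unique pairs amounts to ignoring which protein carries which DHFR fragment. A minimal sketch, using a hypothetical function name and an illustrative list of (F[1,2]-tagged, F[3]-tagged) pairs rather than the actual data set:

```python
def count_unique_pairs(directed_pairs):
    """Count pairs regardless of which protein carries which DHFR fragment."""
    return len({frozenset(pair) for pair in directed_pairs})


# Illustrative list only: (protein fused to F[1,2], protein fused to F[3]).
pairs = [
    ("Rpb2", "Rpb3"),
    ("Rpb3", "Rpb2"),  # reciprocal of the first pair, counted once
    ("Scl1", "Rpn5"),
]
print(len(pairs), "pairs,", count_unique_pairs(pairs), "unique")
```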

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often further improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This effect may thus arise from steric effects rather than from destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe A or lobe B (Fig. 1D, right panel). Together, these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distance, where positive z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer in cases where a longer linker increases the signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.
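The rank correlation between pairwise distance and interaction z-score can be illustrated with a minimal sketch. It assumes that the Spearman coefficient is computed as the Pearson correlation of average ranks; the function names and the six data points are hypothetical and are not taken from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical C-terminus distances (in angstroms) and PPI z-scores.
distances = [20.0, 35.0, 50.0, 80.0, 120.0, 150.0]
z_scores = [9.1, 7.4, 7.9, 4.2, 3.0, 1.1]
rho = spearman(distances, z_scores)
print(f"Spearman r = {rho:.2f}")
```

A negative coefficient, as in this toy example, corresponds to the trend reported for the proteasome: the signal tends to fall as the distance between C-termini grows.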

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information on subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful for inferring the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) or a 4xL (right) linker, compared to a 2xL linker. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, counting reciprocal interactions such as X-DHFR F[1,2]/Y-DHFR F[3] and Y-DHFR F[1,2]/X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.
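The categories used in panels (B) and (C) can be sketched in a few lines. This is only an illustration under stated assumptions: the statistical test that decides whether a change is significant is not reproduced (it enters as a boolean), reciprocal measurements are merged by keeping the first orientation encountered, and the proteins W, X, Y, Z and all values are hypothetical.

```python
from collections import Counter


def classify_ppi(detected_ref, detected_long, significant, signal_diff):
    """Assign one PPI to a category, following the wording of the caption.

    detected_ref  -- detected with the 2xL-2xL reference combination
    detected_long -- detected with a longer-linker combination
    significant   -- outcome of the statistical test (not reproduced here)
    signal_diff   -- longer-linker signal minus reference signal
    """
    if detected_long and not detected_ref:
        return "new"
    if significant and signal_diff > 0:
        return "increased"
    if significant and signal_diff < 0:
        return "decreased"
    return "unchanged"


def unordered_pair(bait, prey):
    """X-F[1,2]/Y-F[3] and Y-F[1,2]/X-F[3] map to the same key."""
    return frozenset((bait, prey))


# Hypothetical records:
# (bait, prey, detected_ref, detected_long, significant, signal_diff)
records = [
    ("X", "Y", False, True, True, 3.1),
    ("Y", "X", False, True, True, 2.7),  # reciprocal of the first record
    ("X", "Z", True, True, False, 0.2),
    ("X", "W", True, True, True, -4.0),
]

categories = {}
for bait, prey, ref, long_, sig, diff in records:
    # Assumption: the first orientation seen decides the category of the pair.
    categories.setdefault(unordered_pair(bait, prey),
                          classify_ppi(ref, long_, sig, diff))

counts = Counter(categories.values())
proportions = {cat: n / len(categories) for cat, n in counts.items()}
print(proportions)
```

Keying each record on an unordered pair is what makes the two orientations of a bait-prey combination count once, as described for panel (C).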


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between the z-scores of all detected PPIs in the proteasome and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini: interactions that are not affected by longer linkers, those that increase in signal and those that are newly detected. P-values of Wilcoxon tests are shown.


Table S1A. Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request.


Table S2B. Identity between each RNApol structure and the experimental sequences

Reference      Yeast protein  Complex     Identity (%)
4C2M chain 1   Rpc10          RNApol I    100
4C2M chain 2   Rpa34          RNApol I    92.4
4C2M chain 3   Rpa49          RNApol I    94.4
4C2M chain 4   Rpa43          RNApol I    100
4C2M chain 5   Rpa190         RNApol I    89.7
4C2M chain 6   Rpc40          RNApol I    100
4C2M chain 7   Rpa135         RNApol I    97.2
4C2M chain 8   Rpb5           RNApol I    100
4C2M chain 9   Rpa14          RNApol I    59.6
4C2M chain 10  Rpa43          RNApol I    81.4
4C2M chain 11  Rpo26          RNApol I    100
4C2M chain 12  Rpa12          RNApol I    100
4C2M chain 13  Rpb8           RNApol I    88.2
4C2M chain 14  Rpc19          RNApol I    100
4C2M chain 15  Rpb10          RNApol I    100
4C2M chain 16  Rpa49          RNApol I    100
4C2M chain 17  Rpc10          RNApol I    100
4C2M chain 18  Rpa43          RNApol I    100
4C2M chain 19  Rpa34          RNApol I    92.4
4C2M chain 20  Rpa135         RNApol I    96.2
4C2M chain 21  Rpa190         RNApol I    88.5
4C2M chain 22  Rpa14          RNApol I    55.1
4C2M chain 23  Rpc40          RNApol I    100
4C2M chain 24  Rpo26          RNApol I    100
4C2M chain 25  Rpb5           RNApol I    100
4C2M chain 26  Rpb8           RNApol I    88.2
4C2M chain 27  Rpa43          RNApol I    80.2
4C2M chain 28  Rpb10          RNApol I    100
4C2M chain 29  Rpa12          RNApol I    96
4C2M chain 30  Rpc19          RNApol I    100
4C3I chain A   Rpa190         RNApol I    89.2
4C3I chain C   Rpc40          RNApol I    99.3
4C3I chain B   Rpa135         RNApol I    98.2
4C3I chain E   Rpb5           RNApol I    100
4C3I chain D   Rpa14          RNApol I    55.1
4C3I chain G   Rpa43          RNApol I    78.3
4C3I chain F   Rpo26          RNApol I    100
4C3I chain I   Rpa12          RNApol I    100
4C3I chain H   Rpb8           RNApol I    84.7
4C3I chain K   Rpc19          RNApol I    100
4C3I chain J   Rpb10          RNApol I    100
4C3I chain M   Rpa49          RNApol I    97.2
4C3I chain L   Rpc10          RNApol I    100
4C3I chain N   Rpa34          RNApol I    88
4V1N chain A   Rpo21          RNApol II   97.9
4V1N chain C   Rpb3           RNApol II   100
4V1N chain B   Rpb2           RNApol II   93.6
4V1N chain E   Rpb5           RNApol II   100
4V1N chain D   Rpb4           RNApol II   80.8
4V1N chain G   Rpb7           RNApol II   100
4V1N chain F   Rpo26          RNApol II   100
4V1N chain I   Rpb9           RNApol II   100
4V1N chain H   Rpb8           RNApol II   91
4V1N chain K   Rpb11          RNApol II   100
4V1N chain J   Rpb10          RNApol II   100
4V1N chain L   Rpc10          RNApol II   100
4V1N chain R   Tfg2           RNApol II   60.3
5FJA chain A   Rpo31          RNApol III  96.2
5FJA chain C   Rpc40          RNApol III  100
5FJA chain B   Ret1           RNApol III  100
5FJA chain E   Rpb5           RNApol III  100
5FJA chain D   Rpc17          RNApol III  73.9
5FJA chain G   Rpc25          RNApol III  85.8
5FJA chain F   Rpo26          RNApol III  100
5FJA chain I   Rpc11          RNApol III  82.7
5FJA chain H   Rpb8           RNApol III  94.5
5FJA chain K   Rpc19          RNApol III  100
5FJA chain J   Rpb10          RNApol III  100
5FJA chain M   Rpc37          RNApol III  84.9
5FJA chain L   Rpc10          RNApol III  100
5FJA chain O   Rpc82          RNApol III  84.3
5FJA chain N   Rpc53          RNApol III  73.8
5FJA chain Q   Rpc31          RNApol III  100
5FJA chain P   Rpc34          RNApol III  57.2


Table S2C. Identity between the proteasome structures and the experimental sequences

Reference                        Yeast protein  Complex     Identity (%)
5CZ4-centered chain A            Pre8           Proteasome  100
5CZ4-centered chain AA           Pre4           Proteasome  100
5CZ4-centered chain B            Pre9           Proteasome  100
5CZ4-centered chain BA           Pre3           Proteasome  100
5CZ4-centered chain C            Pre6           Proteasome  100
5CZ4-centered chain D            Pup2           Proteasome  97.1
5CZ4-centered chain E            Pre5           Proteasome  100
5CZ4-centered chain F            Pre10          Proteasome  100
5CZ4-centered chain G            Scl1           Proteasome  100
5CZ4-centered chain H            Pup1           Proteasome  100
5CZ4-centered chain I            Pup3           Proteasome  100
5CZ4-centered chain J            Pre1           Proteasome  100
5CZ4-centered chain K            Pre2           Proteasome  100
5CZ4-centered chain L            Pre7           Proteasome  100
5CZ4-centered chain M            Pre4           Proteasome  100
5CZ4-centered chain N            Pre3           Proteasome  100
5CZ4-centered chain O            Pre8           Proteasome  100
5CZ4-centered chain P            Pre9           Proteasome  100
5CZ4-centered chain Q            Pre6           Proteasome  100
5CZ4-centered chain R            Pup2           Proteasome  97.1
5CZ4-centered chain S            Pre5           Proteasome  100
5CZ4-centered chain T            Pre10          Proteasome  100
5CZ4-centered chain U            Scl1           Proteasome  100
5CZ4-centered chain V            Pup1           Proteasome  100
5CZ4-centered chain W            Pup3           Proteasome  100
5CZ4-centered chain X            Pre1           Proteasome  100
5CZ4-centered chain Y            Pre2           Proteasome  100
5CZ4-centered chain Z            Pre7           Proteasome  100
5A5B-centered chain A            Pre3           Proteasome  100
5A5B-centered chain AA           Rpn7           Proteasome  100
5A5B-centered chain B            Pup1           Proteasome  100
5A5B-centered chain BA           Rpn3           Proteasome  100
5A5B-centered chain C            Pup3           Proteasome  100
5A5B-centered chain CA           Rpn12          Proteasome  100
5A5B-centered chain D            Pre1           Proteasome  100
5A5B-centered chain DA           Rpn8           Proteasome  82.9
5A5B-centered chain E            Pre2           Proteasome  99.5
5A5B-centered chain EA           Rpn11          Proteasome  89.5
5A5B-centered chain F            Pre7           Proteasome  100
5A5B-centered chain FA           Rpn10          Proteasome  100
5A5B-centered chain G            Pre4           Proteasome  100
5A5B-centered chain GA           Rpn13          Proteasome  100
5A5B-centered chain HA           Sem1           Proteasome  100
5A5B-centered chain IA           Rpn1           Proteasome  85.9
5A5B-centered chain J            Scl1           Proteasome  100
5A5B-centered chain K            Pre8           Proteasome  100
5A5B-centered chain L            Pre9           Proteasome  100
5A5B-centered chain M            Pre6           Proteasome  100
5A5B-centered chain N            Pup2           Proteasome  100
5A5B-centered chain O            Pre5           Proteasome  100
5A5B-centered chain P            Pre10          Proteasome  100
5A5B-centered chain Q            Rpt1           Proteasome  88
5A5B-centered chain R            Rpt2           Proteasome  100
5A5B-centered chain S            Rpt6           Proteasome  100
5A5B-centered chain T            Rpt3           Proteasome  100
5A5B-centered chain U            Rpt4           Proteasome  100
5A5B-centered chain V            Rpt5           Proteasome  93.1
5A5B-centered chain W            Rpn2           Proteasome  90.9
5A5B-centered chain X            Rpn9           Proteasome  100
5A5B-centered chain Y            Rpn5           Proteasome  100
5A5B-centered chain Z            Rpn6           Proteasome  100
Constructed proteasome chain 1   Pup1           Proteasome  100
Constructed proteasome chain 10  Pre8           Proteasome  100
Constructed proteasome chain 11  Pre9           Proteasome  100
Constructed proteasome chain 12  Pre6           Proteasome  100
Constructed proteasome chain 13  Pup2           Proteasome  100
Constructed proteasome chain 14  Pre5           Proteasome  100
Constructed proteasome chain 15  Pre10          Proteasome  100
Constructed proteasome chain 16  Rpt1           Proteasome  88
Constructed proteasome chain 17  Rpt2           Proteasome  100
Constructed proteasome chain 18  Rpt6           Proteasome  100
Constructed proteasome chain 19  Rpt3           Proteasome  100
Constructed proteasome chain 2   Pup3           Proteasome  100
Constructed proteasome chain 20  Rpt4           Proteasome  100
Constructed proteasome chain 21  Rpt5           Proteasome  93.1
Constructed proteasome chain 22  Rpn2           Proteasome  90.9
Constructed proteasome chain 23  Rpn9           Proteasome  100
Constructed proteasome chain 24  Rpn5           Proteasome  100
Constructed proteasome chain 25  Rpn6           Proteasome  100
Constructed proteasome chain 26  Rpn7           Proteasome  100
Constructed proteasome chain 27  Rpn3           Proteasome  100
Constructed proteasome chain 28  Rpn12          Proteasome  100
Constructed proteasome chain 29  Rpn8           Proteasome  82.9
Constructed proteasome chain 3   Pre1           Proteasome  100
Constructed proteasome chain 30  Rpn11          Proteasome  89.5
Constructed proteasome chain 31  Rpn10          Proteasome  100
Constructed proteasome chain 32  Rpn13          Proteasome  100
Constructed proteasome chain 33  Sem1           Proteasome  100
Constructed proteasome chain 34  Rpn1           Proteasome  85.9
Constructed proteasome chain 35  Pup1           Proteasome  100
Constructed proteasome chain 36  Pup3           Proteasome  100
Constructed proteasome chain 37  Pre1           Proteasome  100
Constructed proteasome chain 38  Pre2           Proteasome  100
Constructed proteasome chain 39  Pre7           Proteasome  100
Constructed proteasome chain 4   Pre2           Proteasome  100
Constructed proteasome chain 40  Pre4           Proteasome  100
Constructed proteasome chain 41  Pre3           Proteasome  100
Constructed proteasome chain 42  Pre4           Proteasome  100
Constructed proteasome chain 45  Scl1           Proteasome  100
Constructed proteasome chain 46  Pre8           Proteasome  100
Constructed proteasome chain 47  Pre9           Proteasome  100
Constructed proteasome chain 48  Pre6           Proteasome  100
Constructed proteasome chain 49  Pup2           Proteasome  100
Constructed proteasome chain 5   Pre7           Proteasome  100
Constructed proteasome chain 50  Pre5           Proteasome  100
Constructed proteasome chain 51  Pre10          Proteasome  100
Constructed proteasome chain 52  Rpt1           Proteasome  88
Constructed proteasome chain 53  Rpt2           Proteasome  100
Constructed proteasome chain 54  Rpt6           Proteasome  100
Constructed proteasome chain 55  Rpt3           Proteasome  100
Constructed proteasome chain 56  Rpt4           Proteasome  100
Constructed proteasome chain 57  Rpt5           Proteasome  93.1
Constructed proteasome chain 58  Rpn2           Proteasome  90.9
Constructed proteasome chain 59  Rpn9           Proteasome  100
Constructed proteasome chain 6   Pre3           Proteasome  100
Constructed proteasome chain 60  Rpn5           Proteasome  100
Constructed proteasome chain 61  Rpn6           Proteasome  100
Constructed proteasome chain 62  Rpn7           Proteasome  100
Constructed proteasome chain 63  Rpn3           Proteasome  100
Constructed proteasome chain 64  Rpn12          Proteasome  100
Constructed proteasome chain 65  Rpn8           Proteasome  82.9
Constructed proteasome chain 66  Rpn11          Proteasome  89.5
Constructed proteasome chain 67  Rpn10          Proteasome  100
Constructed proteasome chain 68  Rpn13          Proteasome  100
Constructed proteasome chain 69  Sem1           Proteasome  100
Constructed proteasome chain 70  Rpn1           Proteasome  85.9
Constructed proteasome chain 9   Scl1           Proteasome  100


Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Yeast protein  Complex  Reference  Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right owing to the ability of the 4xL linker to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results of the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in standard PCA (left).
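The relation between colony size, z-score and the positivity threshold in panel (B) can be illustrated with a short sketch. The exact normalization used in the study is not given in this section, so the code assumes the usual definition, the deviation of a colony size from the mean of a background distribution in units of its standard deviation; the function name and all values are hypothetical.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # threshold quoted in the caption


def z_scores(colony_sizes, background):
    """Deviation of each colony size from the background, in SD units.

    Assumption: the background is summarized by its sample mean and
    sample standard deviation; the study may normalize differently.
    """
    mu = mean(background)
    sigma = stdev(background)
    return [(size - mu) / sigma for size in colony_sizes]


# Hypothetical colony sizes (arbitrary units), not data from the study.
background = [100, 110, 90, 105, 95]
colony_sizes = [102, 150, 121]

scores = z_scores(colony_sizes, background)
positive = [z > Z_THRESHOLD for z in scores]
for size, z, hit in zip(colony_sizes, scores, positive):
    print(f"size={size} z={z:.2f} positive={hit}")
```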


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core of the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
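A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm. This is a minimal illustration and not the pipeline used for the thesis: it assumes that surface residues are reduced to single points, that two points are connected when they lie within a chosen cutoff, and that each edge is weighted by the Euclidean distance. The coordinates, the labels and the cutoff are hypothetical.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, cutoff):
    """Length of the shortest path from `start` to `end` (Dijkstra).

    coords -- dict mapping a residue label to its (x, y, z) position
    cutoff -- maximum distance at which two residues are linked (assumed)
    Returns math.inf when `end` cannot be reached.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == end:
            return d
        if d > best.get(node, math.inf):
            continue  # stale queue entry
        for other, xyz in coords.items():
            if other == node:
                continue
            step = math.dist(coords[node], xyz)
            if step <= cutoff and d + step < best.get(other, math.inf):
                best[other] = d + step
                heapq.heappush(queue, (d + step, other))
    return math.inf


# Hypothetical positions (angstroms): two C-termini and two surface residues.
coords = {
    "Cter_A": (0.0, 0.0, 0.0),
    "s1": (3.0, 4.0, 0.0),
    "s2": (6.0, 8.0, 0.0),
    "Cter_B": (6.0, 8.0, 5.0),
}
path_length = shortest_surface_path(coords, "Cter_A", "Cter_B", cutoff=6.0)
print(path_length)
```

Restricting the nodes to surface residues forces the path to run along the outside of the complex rather than through it, which gives a more realistic proxy for the span that linkers must cover than a straight line between C-termini.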


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are in close proximity in space without these necessarily being physical interactions. Such a method would make it possible to explore and dissect the architecture of protein complexes in greater depth. Concretely, the approach consisted of modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, validation of the method could be completed by comparing the results obtained with distances measured from the available protein structures of the proteasome.

The results of the first validation show that by varying a single parameter, namely doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, more than 30% of all unique interactions detected were new. One could therefore expect to obtain nearly as many new interactions if this variation of PCA were applied to protein complexes that have already been studied. This percentage could vary depending on the number of combinations of linkers of different lengths used. For example, it could be reduced by using a single combination, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and those whose signal increases. The rare cases in which the signal decreased as linker length increased appear to be caused by steric effects rather than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. Moreover, the use of extended linkers provides information on the organization of protein complexes, particularly when it involves central proteins. Finally, the detected associations closely reflect the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results obtained, it is possible to confirm that increasing linker length does allow the detection of associations between proteins that are more distant in space.

The modification made to DHFR PCA represents a notable advance in the study of protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2] fragment, it is possible to increase the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths, which would provide a validation and a point of comparison and would help detect problems that may have arisen during construction of the proteins. For example, it becomes easier to spot a problem of incorrect recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. However, adding this control certainly makes the experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close together, which reduces false positives. DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths providing several resolutions of the PPI network. One would first need to determine the maximum length that allows detection of plausible protein-protein associations while limiting false positives. One would also need to determine the optimal increment for maximizing new information while taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes the creation of collections of DHFR F[1,2] fusion proteins with different linker lengths more realistic. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas a lower resolution would be needed for large protein complexes.

The method developed in this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. At the scale of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics : TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics : MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.

22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics : MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.

43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments : JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews : MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.

PCA images and statistical analyses

Analysis of protein distances within complexes

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

PCA signal reflects the super-organization of protein complexes

Longer linkers allow detection of more distant proteins in complexes

Conclusion

Acknowledgements

General conclusion

Bibliography

List of tables

Table S1A Description of the strains constructed and used for this study

Table S1B PCA data for global PCA experiment

Table S1C PCA data for intra-complexes experiment

Table S1D PCR primers used in this study

Table S2A Distances between C-termini calculated from molecular modeling

Table S2B Identity between each RNApol structure and the experimental sequences

Table S2C Identity between proteasome structure and the experimental sequence

Table S2D Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

List of figures

Figure 1 Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation (PCA) screen and prove useful to infer the super-organization of protein complexes

Figure 2 Longer linkers allow for the detection of more distant proteins within complexes

Figure S1 Data related to the PCA experiments

Figure S2 Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

List of abbreviations

% Percentage

°C Degree Celsius

Å Ångström

DNA Deoxyribonucleic acid

Amp Ampicillin

mRNA Messenger ribonucleic acid

BioID Proximity-dependent biotinylation

ClonNAT Nourseothricin

COG Conserved oligomeric Golgi

DHFR Dihydrofolate reductase

DMSO Dimethyl sulfoxide

F[1,2] Fragment 1,2 of DHFR

F[3] Fragment 3 of DHFR

FDR Corrected P-value

FRET Energy transfer between fluorescent molecules

g Gram

Gly or G Glycine

h Hour

HygB Hygromycin B

Is Interaction score

L Liter

Log Logarithm

M Molar

Min Minute

mL Milliliter

mM Millimolar

MS Mass spectrometry

MS/MS Tandem mass spectrometry

MTX Methotrexate

MYTH Membrane yeast two-hybrid

NaCl Sodium chloride

NMR Nuclear magnetic resonance

OD Optical density

PBS Phosphate-buffered saline

PCA Protein-fragment complementation assay

PCR Polymerase chain reaction

PKA Protein kinase A

PPI Protein-protein interaction

Q1 First quartile

Q3 Third quartile

r Correlation coefficient

RNApol RNA polymerase

Sdb Standard deviation

Ser or S Serine

SDS Sodium dodecyl sulfate

SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis

t-test Student's t-test

YPD Yeast extract peptone dextrose

Y2H Yeast two-hybrid

Zs Z-score

μb Estimated mean

μg Microgram

μL Microliter

μM Micromolar

2YT 2× yeast extract tryptone

2xL Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif

3xL Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif

4xL Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif

Acknowledgements

Completing this project required the help of several people whom I sincerely wish to thank. First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey, Christian encouraged me to give the best of myself, both scientifically and as a member of the group. He not only gave me the material means to do so, but also showed me that I had the ability to do it. Christian is a very present supervisor who is available for his students. He offered me opportunities and supported me in each of them.

I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr. Nicolas Bisson, for their advice and the time they devoted to this project.

I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the laboratory's two research professionals. Their great expertise and their passion for science are a pillar of this team. Without their valuable advice, dedication and availability, carrying out this project would have been particularly arduous. I also wish to thank my collaborators Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is all the better. Special thanks to Xavier for his help, his availability and the stimulating discussions.

I believe it is important to thank all the members of the Landry laboratory. Graduate studies require spending a lot of time in the laboratory, which becomes like a second home. Hence the importance of sharing laughs and building a rapport with its members. I would like to thank them all for the chatting and joking at the famous "tea breaks", the lively discussions and, of course, the support both in the laboratory and morally. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. A very special thank you to Guillaume and Hélène, who were especially good at putting a smile on my face and at supporting and advising me in difficult times.

It is also important to thank my parents, as well as my whole family and my friends. My parents have always encouraged me to fulfil myself and to love my work. They not only provided me with an ideal setting to reach my goals throughout my studies, but also offered me their moral support and instilled in me the importance of always doing one's best. The values they passed on to me gave me a strong sense of responsibility, honesty and commitment. Thanks to my family and friends, I was able to unwind, simply have fun and pour my heart out from time to time. They were a source of moral support.

Finally, I wish to thank from the bottom of my heart my partner, Marc Bélanger. Marc is an incredibly generous person: generous with his time, his listening, his knowledge and his passions. He was an invaluable support throughout this journey, at every moment. His encouragement, his shoulder, his tissues and his understanding soothed my fears and my sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has brought me personally, humanly and professionally. Marc has made me a better person and I will always be grateful to him. Thank you, my love, thank you for everything.

Foreword

This thesis consists of a single chapter written in the form of a scientific article that will be submitted for publication. The article presents an adaptation of the PCA method that allows the detection of associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to planning the experiments with Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research professionals). Several people, myself included, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were done jointly by Isabelle Gagnon-Arsenault, Christian Landry and myself.

During this project, I also contributed to writing a literature review published in Briefings in functional genomics in March 2016 under the title Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry. This article is not presented in this thesis.

General introduction

1.1 The fundamental role of protein-protein interactions

Because of their great diversity of roles, proteins are considered the machinery of life. Their associations, whether temporary or permanent, are at the heart of signaling and regulatory pathways as well as protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since they are involved in all cellular processes as well as in the maintenance of cellular functions.

Interactions that form transiently are often found in signaling and regulatory processes. They require excellent spatiotemporal coordination, which explains why poor coordination can lead to diseases such as cancer (1). One example of a transient association is that of the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, aging and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).

In addition to transient interactions, the cell is home to stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs of a complex are stable, the complex may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins allows the emergence of additional biological activities that would be impossible for the individual proteins. An example that illustrates this concept very well is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).

The two preceding examples clearly demonstrate the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning and cellular viability of a given organism. The PPI network has been mapped on a large scale for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses (27-29). These maps represent a static picture of the network and do not fully account for the cell's ability to adapt to different conditions (e.g., environment, cell cycle). To overcome this limitation, additional maps were then produced that consider the dynamics of interaction networks, that is, by perturbing cell growth conditions. Among other things, they provide information on the adaptation, or plasticity, of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.

1.2 Practical applications of the study of protein-protein interactions

The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore in yeast and trypanosomes (30). These two organisms, which diverged more than 1.5 billion years ago, show similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the membrane of the cell nucleus and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mRNA export mechanisms in the two organisms (30). In addition, perturbing PPIs makes it possible to elucidate the robustness of a protein complex to mutations, that is, the ability of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analyzing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of certain proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).

In the medical field, the study of PPIs has been widely used to discover new drugs (32-34). In addition, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which should eventually make it possible to treat the infections caused by these parasites (35). PPIs also help to understand the genetic basis of diseases, as shown by Sahni and colleagues. This team examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of interaction networks was responsible for the diseases under study, affecting the networks either partially or completely. Moreover, different mutations in the same gene lead to different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary because each has advantages and limitations that restrict it to a different subset of the interaction network (37). Nevertheless, all of these methods can be divided into two main categories: methods that determine the composition of protein complexes and methods that determine the physical interactions between two proteins.

The first category includes methods that purify a protein complex, by either affinity or separation chromatography, and then analyze it by mass spectrometry (MS). The second category comprises a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in this second category share a very similar principle: the reconstitution of a functional reporter that emits a signal when the two proteins interact physically. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, the term "hybrid method" refers to methods that detect associations between proteins that are close together in space without these associations necessarily being physical interactions. These methods therefore combine features of both categories. For the purposes of this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary because one defines the components of a protein complex and the other defines the relationships among them.

1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purification of protein complexes followed by identification of their components by MS aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract by means of an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the nonspecific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins present in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first MS, a peptide is selected and fragmented, and a new spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this purification is that it can isolate all of the protein complexes of an organism for study (43).
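As a rough illustration of the mass-to-charge ratio on which this separation relies, the following Python sketch computes m/z for a peptide carrying one or more protons in positive-ion mode. The peptide mass is a hypothetical value chosen for the example, and the function name is illustrative rather than taken from any particular software.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da


def mass_to_charge(neutral_mass: float, charge: int) -> float:
    """Return m/z for a peptide of given neutral mass carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


peptide_mass = 1500.0  # hypothetical monoisotopic mass in Da
for z in (1, 2, 3):
    print(f"z={z}: m/z = {mass_to_charge(peptide_mass, z):.4f}")
```

The same peptide therefore appears at different positions in the spectrum depending on its charge state, which is one reason why spectra are matched against a database rather than read directly.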

1.3.2 Methods determining the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment complementation

Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments attached to the two proteins of interest through a linker. When the two proteins of interest interact physically, the two reporter fragments assemble, reconstituting a functional reporter that produces a detectable signal. In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that enabled the development of several others. However, the technique has some limitations. First, in classical Y2H the proteins studied must be soluble. Variations of the method have nevertheless been introduced to allow the study of membrane proteins (45-47); this variant is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, shows background noise and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as humans, to be studied in a simpler model (51).

In MYTH, the two reporter fragments are a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest interact physically, the transcription factor attached to the reconstituted ubiquitin is released, activating transcription of a reporter gene. Split-ubiquitin methods have enabled major advances in the study of membrane proteins that are insoluble and located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the impossibility of quantifying the results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of using a transcription factor as a reporter, it uses a protein that has been split into two fragments. The choice of reporter and of cleavage site were decisive in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to associate spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as a reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved notably in the synthesis of certain DNA bases (the purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. The technique has the advantage of being quantitative while preserving the native promoter of the proteins studied (48, 55, 56). In addition, PCA results suggest that the cellular localization of the proteins is preserved: there is a gene ontology enrichment for several known proteins sharing the same cellular localization (55). However, a change in localization cannot be ruled out, because the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. Adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. The composition or length of the linker may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are most useful when a flexible linker is insufficient to separate the two elements or interferes with the activity of the protein. In vivo cleavable linkers release the reporter fragment under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although classified in the second category of methods, FRET, cross-linking followed by MS and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on energy transfer between two fluorescent proteins in close proximity. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are close to each other. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and partial overlap of the fluorescence of the two proteins can hamper interpretation of the results (60-63).

Cross-linking followed by MS is almost identical to the purification and MS techniques, except that before purification the proteins are covalently linked to one another. These links withstand enzymatic digestion, thereby providing structural information on how the proteins associate within the complex. Nevertheless, cross-linking complicates data analysis and can lead to an incorrect picture of the architecture of the protein complex. The method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity that is fused to the protein of interest. Interactors bearing a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID detects weak and transient interactions in addition to interactions between neighboring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can impair the activity of the protein of interest or the formation of interactions. In addition, the method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They report on the proximity of proteins, giving access to a new scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would help to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. These new hybrid methods would compensate for the shortcomings of high-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to many protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good association of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small amino acids, the first nonpolar and the second polar and uncharged. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also constrains the ability to detect an interaction, as observed by the research team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to obtain new structural information.
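To make the relationship between linker length and detectable distance concrete, here is a minimal Python sketch. It assumes, as a simplification that the text does not test, that the maximum reach scales linearly with the number of linker residues, calibrated on the 82 Å reported for two linkers of two GGGGS repeats each. The function name and the linker combinations shown are illustrative, and the estimate ignores the size and orientation of the reporter fragments themselves.

```python
MOTIF = "GGGGS"  # linker repeat unit described in the text

# Calibration (assumption): 82 Å for two linkers of 2 repeats each (20 residues).
ANGSTROM_PER_RESIDUE = 82.0 / (2 * 2 * len(MOTIF))


def max_span(repeats_a: int, repeats_b: int) -> float:
    """Rough upper bound (Å) on the distance bridged by two linkers placed end to end."""
    residues = (repeats_a + repeats_b) * len(MOTIF)
    return residues * ANGSTROM_PER_RESIDUE


# Hypothetical combinations of linker lengths (number of repeats on each protein).
for a, b in [(2, 2), (2, 3), (3, 3), (2, 4), (4, 4)]:
    print(f"{a}x + {b}x linkers: up to about {max_span(a, b):.0f} Å")
```

Under this linear assumption, doubling both linkers would double the reach, but flexible linkers are rarely fully extended in the cell, so these values are best read as upper bounds.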

1.6 Research objectives

The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the length of the DHFR PCA linker would allow the detection of interactions increasingly distant in space, thereby modulating the scale of molecular resolution. This adaptation would yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The final objective was to determine whether increasing linker length allowed the detection of associations between proteins farther apart in space. To do so, distances were calculated between the proteins contained in the proteasome structures and compared with the experimental results.
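The distance comparison described in the last objective can be sketched as follows in Python. This is only an illustration under stated assumptions: the coordinates are invented, the 82 Å threshold is the value reported for the standard linker, and the actual analysis may have relied on different atoms, structures or software.

```python
import math


def cterm_distance(a, b):
    """Euclidean distance (Å) between two C-terminal atom positions (x, y, z)."""
    return math.dist(a, b)


def within_reach(a, b, max_distance=82.0):
    """True if the two C-termini are closer than max_distance (Å)."""
    return cterm_distance(a, b) < max_distance


# Hypothetical coordinates (Å) for the C-termini of two subunits of a complex.
subunit_a = (0.0, 0.0, 0.0)
subunit_b = (30.0, 40.0, 0.0)

print(f"distance = {cterm_distance(subunit_a, subunit_b):.1f} Å")
print("within reach of the standard linkers:", within_reach(subunit_a, subunit_b))
```

Repeating this calculation for every pair of subunits in a structure gives a distance table that can then be set against the interactions detected with each linker length.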

This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive for several reasons, notably the availability of numerous powerful genetic tools, its rapid cell division and the abundance of data on protein complex structures and PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the dihydrofolate reductase-based protein-fragment complementation assay (DHFR PCA) in yeast to obtain low-resolution structural constraints on protein complexes. We showed that using longer peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Longer linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex membership or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, the assignment of a protein to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86).

The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods that detect co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint imposed by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar for solid medium) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
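Assuming the medium concentrations are percentages in weight per volume, the amounts to weigh for a given batch can be worked out with a short Python sketch such as the one below. The batch volume and the function name are hypothetical and serve only to illustrate the arithmetic.

```python
def grams_needed(percent_w_v: float, volume_ml: float) -> float:
    """Grams of solute for a given % (w/v) and final volume in mL."""
    return percent_w_v * volume_ml / 100.0


# YPD solid medium components as % (w/v), read from the recipe above.
YPD = {"yeast extract": 1.0, "tryptone": 2.0, "glucose": 2.0, "agar": 2.0}

volume_ml = 500  # hypothetical batch size
for component, percent in YPD.items():
    print(f"{component}: {grams_needed(percent, volume_ml):.1f} g")
```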


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar for solid medium) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to linkers of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
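The use of synonymous codons can be illustrated with a short Python sketch: two different DNA sequences encode the same Gly-Gly-Gly-Gly-Ser repeat, which keeps the repeats distinct at the nucleotide level. The two sequences below are hypothetical examples built from the standard genetic code; they are not the sequences used in the plasmids.

```python
# Standard genetic code, restricted to the glycine and serine codons.
CODON_TABLE = {
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
}


def translate(dna: str) -> str:
    """Translate a DNA sequence made only of Gly/Ser codons into one-letter code."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


repeat_1 = "GGTGGCGGAGGGTCT"  # hypothetical codon choice
repeat_2 = "GGAGGGGGTGGCAGC"  # hypothetical synonymous alternative

print(translate(repeat_1), translate(repeat_2))
print("identical DNA:", repeat_1 == repeat_2)
```

One practical benefit of varying the codons, though the text does not state the motivation, is that it avoids long identical DNA repeats, which can be unstable or hard to synthesize and sequence.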

To replace the 2xL of pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted into the plasmid pUC57 with flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed into E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.
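For readers setting up a similar assembly, the mass of insert corresponding to a given insert:vector ratio can be estimated as in the Python sketch below. It assumes the ratio is molar, and the fragment sizes and vector amount are hypothetical values, since the text does not give them.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   molar_ratio: float) -> float:
    """Mass of insert (ng) giving `molar_ratio` insert molecules per vector molecule.

    For double-stranded DNA, mass is proportional to length, so the molar
    ratio is converted to a mass ratio using the two fragment lengths.
    """
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


# Hypothetical values: 50 ng of a 5000 bp vector and a 500 bp insert, 5:1 ratio.
print(f"{insert_mass_ng(50, 5000, 500, 5):.1f} ng of insert")
```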

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR-amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR products were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were made at the 3′ end of genes. The 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused to the DHFR fragments (PCR primer sequences are given in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusion in all strains.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains with proteins fused to the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies directed against them. 20 OD600 of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). 425-600 µm glass beads (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p, anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed a second time with IRDye®680RD Goat anti-Rabbit IgG (1:10000), IRDye®680RD Donkey anti-Goat IgG (1:5000) or IRDye®800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2xL/3xL/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or because they interact with one of them based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.
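The size of this screen follows directly from the counts given above. The short sketch below only reproduces that arithmetic; the variable names are illustrative and are not taken from the original analysis scripts.

```python
# Counts quoted in the text; variable names are hypothetical.
n_bait_proteins = 15
linkers = ["2xL", "3xL", "4xL"]
n_selected_preys = 495   # same complex as a bait, or BioGRID interactor
n_random_preys = 97      # random cytoplasmic or nuclear controls
replicates = 4

n_baits = n_bait_proteins * len(linkers)      # one bait strain per linker length
n_preys = n_selected_preys + n_random_preys   # all prey strains
n_pairs = n_baits * n_preys                   # bait-prey combinations tested
n_colonies = n_pairs * replicates             # colonies measured on MTX

print(n_baits, n_preys, n_pairs, n_colonies)
```

The number of bait-prey combinations obtained this way matches the total of potential interactions reported in the Results section.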

For the intra-complexes experiment, we performed a review of the literature and considered the consensus protein complexes published by (84) to choose 95 central and associated proteins that are members of the following complexes: the RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested), proteasome (n=47, 44 tested) and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
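The block layout described above can be sketched as the Cartesian product of the two linker lengths on each side of the interaction. This is a minimal illustration only; the function name and the example protein pair are assumptions, not part of the original plate-design procedure.

```python
from itertools import product

def linker_block(bait, prey, linkers=("2xL", "4xL")):
    """Return the four bait-prey linker combinations that make up one block."""
    return [(f"{bait}-{b}", f"{prey}-{p}") for b, p in product(linkers, repeat=2)]

# Hypothetical example pair; any bait-prey pair is handled the same way.
block = linker_block("Rpb5", "Rpo26")
for bait_strain, prey_strain in block:
    print(bait_strain, "x", prey_strain)
```

Because the four combinations sit next to each other on the array, their colony sizes can later be compared within the same local environment.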

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30°C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the difference in signal between linkers was increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. Correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function "Analyze particle" in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, using the particle that is closest to the center. We considered the particle as being a colony if the mass center was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2 transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
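The two steps above (diploid size filter, then log2 transform with a pseudocount of 1) can be sketched as follows. The dictionary keys are hypothetical field names chosen for illustration; the threshold of 16 is the value quoted in the text.

```python
import math

MIN_DIPLOID_SIZE = 16  # threshold quoted in the text

def transform_colonies(colonies):
    """Drop colonies that grew poorly on diploid selection, then return
    log2(intensity + 1) of the remaining MTX colony intensities.

    colonies: list of dicts with 'mtx_intensity' and 'diploid_size'
    keys (hypothetical field names)."""
    kept = []
    for colony in colonies:
        if colony["diploid_size"] < MIN_DIPLOID_SIZE:
            continue  # diploid colony too small: not a reliable measurement
        kept.append(math.log2(colony["mtx_intensity"] + 1))
    return kept

example = [
    {"mtx_intensity": 0, "diploid_size": 20},     # no growth on MTX
    {"mtx_intensity": 1023, "diploid_size": 20},  # strong growth on MTX
    {"mtx_intensity": 500, "diploid_size": 10},   # filtered out
]
print(transform_colonies(example))
```

Adding 1 before the transform keeps colonies with zero intensity in the dataset, where they map to a score of 0 instead of an undefined value.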

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (µb) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - µb)/sdb). Interactions with a Zs greater than 2.5 were considered significant detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
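The scoring procedure can be illustrated with a small stand-in for the mixture fit. The original analysis used the R package mixtools; the pure-Python EM routine below is only a sketch of the same idea, written under the assumption that the component with the lower mean is the background. The function names and the simulated scores are illustrative and are not data from the study.

```python
import math
import random
import statistics

def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def fit_two_normals(scores, n_iter=100):
    """EM fit of a two-component normal mixture (assumes non-constant data).
    Returns two (weight, mean, sd) tuples sorted by increasing mean."""
    xs = list(scores)
    # Crude initialisation: split the data at the midpoint of its range.
    mid = (min(xs) + max(xs)) / 2
    groups = [[x for x in xs if x <= mid], [x for x in xs if x > mid]]
    params = [(len(g) / len(xs), statistics.mean(g), statistics.pstdev(g) or 1.0)
              for g in groups]
    for _ in range(n_iter):
        # E-step: posterior probability that each score belongs to component 0.
        resp0 = []
        for x in xs:
            p0 = params[0][0] * normal_pdf(x, params[0][1], params[0][2])
            p1 = params[1][0] * normal_pdf(x, params[1][1], params[1][2])
            resp0.append(p0 / (p0 + p1) if p0 + p1 > 0 else 0.5)
        # M-step: update weight, mean and sd of each component.
        params = []
        for resp in (resp0, [1 - r for r in resp0]):
            n = max(sum(resp), 1e-12)
            mu = sum(r * x for r, x in zip(resp, xs)) / n
            var = sum(r * (x - mu) ** 2 for r, x in zip(resp, xs)) / n
            params.append((n / len(xs), mu, max(math.sqrt(var), 1e-6)))
    return sorted(params, key=lambda p: p[1])

def z_scores(scores):
    """Zs = (Is - mu_b) / sd_b, relative to the lower (background) component."""
    (_, mu_b, sd_b), _ = fit_two_normals(scores)
    return [(s - mu_b) / sd_b for s in scores]

# Simulated interaction scores: mostly background, a few true interactions.
random.seed(1)
scores = ([random.gauss(8, 0.5) for _ in range(900)]
          + [random.gauss(13, 1) for _ in range(100)])
background, signal = fit_two_normals(scores)
detected = [z for z in z_scores(scores) if z > 2.5]
print(background, signal, len(detected))
```

Expressing every interaction score in units of background standard deviations is what makes scores from different linker combinations comparable, since each combination has its own background distribution.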

For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (µb) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
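The extreme-outlier rule at the start of this paragraph can be sketched as below. This is an illustration, not the original script: quartiles are computed with the default method of Python's `statistics.quantiles`, which may differ slightly from the quantile definition used in the original analysis, and the example values are made up.

```python
import statistics

def drop_extreme_outliers(values, k=3.0):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR], with IQR = Q3 - Q1."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

# Made-up replicate colony sizes for one interaction; 100 is an extreme outlier.
replicates = [10, 11, 12, 13, 14, 15, 16, 100]
print(drop_extreme_outliers(replicates))
```

Using three interquartile ranges instead of the more common 1.5 removes only the most aberrant colonies, which preserves replicates for the minimum-replicate filter that follows.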

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination: 2xL-4xL, 4xL-2xL and 4xL-4xL. A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1 with t-test FDR p-value < 0.05) (Table S1C).
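The three categories can be summarized as a small decision function. The sketch below makes two assumptions that the text does not state: the FDR correction is taken to be the Benjamini-Hochberg procedure, and pairs with a significant p-value but a difference between -1 and 1 are labeled "unclassified" because the text does not say how they were handled. Function and argument names are hypothetical, and the paired t-test itself is not reimplemented (p-values are taken as input).

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # from the largest p-value to the smallest
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify(detected_ref, detected_longer, diffs, fdr_pvalues,
             alpha=0.05, min_diff=1.0):
    """Classify one protein pair.

    detected_ref: detected with the 2xL-2xL reference combination
    detected_longer: detected with at least one longer combination
    diffs, fdr_pvalues: one value per longer combination versus 2xL-2xL
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    if any(p < alpha and abs(d) > min_diff for d, p in zip(diffs, fdr_pvalues)):
        return "quantitative change"
    if all(p > alpha for p in fdr_pvalues):
        return "unchanged"
    return "unclassified"  # significant but small difference: not covered by the text

adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(adjusted)
print(classify(True, True, [1.5, 0.2, 0.1], [0.01, 0.5, 0.6]))
```

Requiring both a minimum effect size and a corrected p-value keeps very small but statistically detectable differences from being reported as quantitative changes.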

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were taken from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the two nodes. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
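The surface-path distance can be illustrated with a toy graph. The original analysis used the Dijkstra implementation in NetworkX; the sketch below substitutes a small pure-Python version so that it runs without dependencies, and the node names and coordinates are invented for the example.

```python
import heapq
import math

def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff` (in Å);
    edge weight is the Euclidean distance between the two nodes."""
    nodes = list(coords)
    graph = {n: [] for n in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            nd = d + weight
            if nd < best.get(neighbor, math.inf):
                best[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return math.inf

# Invented Cα coordinates (Å): two C-termini and one intermediate surface residue.
coords = {
    "A_Cterm": (0.0, 0.0, 0.0),
    "surface_1": (12.0, 5.0, 0.0),
    "B_Cterm": (24.0, 0.0, 0.0),
}
graph = build_graph(coords, cutoff=15.0)
path_length = shortest_path_length(graph, "A_Cterm", "B_Cterm")
print(path_length, math.dist(coords["A_Cterm"], coords["B_Cterm"]))
```

In this toy case the two C-termini are 24 Å apart in a straight line, but they are farther apart than the cutoff, so the path has to go through the intermediate surface residue and is longer. Restricting nodes to surface residues is what makes the path follow the outside of the complex rather than cut through it.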

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins within the same complexes as the baits, and random proteins used as controls, for a total of 26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10192 unique tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI).
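The distinction between protein pairs and unique pairs comes down to ignoring which partner carries which DHFR fragment. A minimal sketch, with an invented list of detected bait-prey pairs:

```python
def unique_pairs(detected):
    """Collapse reciprocal bait-prey pairs (X-Y and Y-X) into one unordered pair.

    detected: iterable of (bait_protein, prey_protein) tuples."""
    return {frozenset(pair) for pair in detected}

# Invented example: the first two entries are reciprocal tests of the same PPI.
detected_pairs = [("Rpb2", "Rpb3"), ("Rpb3", "Rpb2"), ("Rpo21", "Rpb2")]
print(len(detected_pairs), len(unique_pairs(detected_pairs)))
```

A `frozenset` is used because it is unordered and hashable, so (X, Y) and (Y, X) map to the same set element.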

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to detect low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institute of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canadian Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.

Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.

Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.

Table S1A. Description of the strains constructed and used for this study.

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment.

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complexes experiment.

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study.

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling.

Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference        Yeast protein   Complex      Identity (%)
4C2M chain 1     Rpc10           RNApol I     100
4C2M chain 2     Rpa34           RNApol I     92.4
4C2M chain 3     Rpa49           RNApol I     94.4
4C2M chain 4     Rpa43           RNApol I     100
4C2M chain 5     Rpa190          RNApol I     89.7
4C2M chain 6     Rpc40           RNApol I     100
4C2M chain 7     Rpa135          RNApol I     97.2
4C2M chain 8     Rpb5            RNApol I     100
4C2M chain 9     Rpa14           RNApol I     59.6
4C2M chain 10    Rpa43           RNApol I     81.4
4C2M chain 11    Rpo26           RNApol I     100
4C2M chain 12    Rpa12           RNApol I     100
4C2M chain 13    Rpb8            RNApol I     88.2
4C2M chain 14    Rpc19           RNApol I     100
4C2M chain 15    Rpb10           RNApol I     100
4C2M chain 16    Rpa49           RNApol I     100
4C2M chain 17    Rpc10           RNApol I     100
4C2M chain 18    Rpa43           RNApol I     100
4C2M chain 19    Rpa34           RNApol I     92.4
4C2M chain 20    Rpa135          RNApol I     96.2
4C2M chain 21    Rpa190          RNApol I     88.5
4C2M chain 22    Rpa14           RNApol I     55.1
4C2M chain 23    Rpc40           RNApol I     100
4C2M chain 24    Rpo26           RNApol I     100
4C2M chain 25    Rpb5            RNApol I     100
4C2M chain 26    Rpb8            RNApol I     88.2
4C2M chain 27    Rpa43           RNApol I     80.2
4C2M chain 28    Rpb10           RNApol I     100
4C2M chain 29    Rpa12           RNApol I     96
4C2M chain 30    Rpc19           RNApol I     100
4C3I chain A     Rpa190          RNApol I     89.2
4C3I chain C     Rpc40           RNApol I     99.3
4C3I chain B     Rpa135          RNApol I     98.2
4C3I chain E     Rpb5            RNApol I     100
4C3I chain D     Rpa14           RNApol I     55.1
4C3I chain G     Rpa43           RNApol I     78.3
4C3I chain F     Rpo26           RNApol I     100
4C3I chain I     Rpa12           RNApol I     100
4C3I chain H     Rpb8            RNApol I     84.7
4C3I chain K     Rpc19           RNApol I     100
4C3I chain J     Rpb10           RNApol I     100
4C3I chain M     Rpa49           RNApol I     97.2
4C3I chain L     Rpc10           RNApol I     100
4C3I chain N     Rpa34           RNApol I     88
4V1N chain A     Rpo21           RNApol II    97.9
4V1N chain C     Rpb3            RNApol II    100
4V1N chain B     Rpb2            RNApol II    93.6
4V1N chain E     Rpb5            RNApol II    100
4V1N chain D     Rpb4            RNApol II    80.8
4V1N chain G     Rpb7            RNApol II    100
4V1N chain F     Rpo26           RNApol II    100
4V1N chain I     Rpb9            RNApol II    100
4V1N chain H     Rpb8            RNApol II    91
4V1N chain K     Rpb11           RNApol II    100
4V1N chain J     Rpb10           RNApol II    100
4V1N chain L     Rpc10           RNApol II    100
4V1N chain R     Tfg2            RNApol II    60.3
5FJA chain A     Rpo31           RNApol III   96.2
5FJA chain C     Rpc40           RNApol III   100
5FJA chain B     Ret1            RNApol III   100
5FJA chain E     Rpb5            RNApol III   100
5FJA chain D     Rpc17           RNApol III   73.9
5FJA chain G     Rpc25           RNApol III   85.8
5FJA chain F     Rpo26           RNApol III   100
5FJA chain I     Rpc11           RNApol III   82.7
5FJA chain H     Rpb8            RNApol III   94.5
5FJA chain K     Rpc19           RNApol III   100
5FJA chain J     Rpb10           RNApol III   100
5FJA chain M     Rpc37           RNApol III   84.9
5FJA chain L     Rpc10           RNApol III   100
5FJA chain O     Rpc82           RNApol III   84.3
5FJA chain N     Rpc53           RNApol III   73.8
5FJA chain Q     Rpc31           RNApol III   100
5FJA chain P     Rpc34           RNApol III   57.2
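
The source does not state how the identity values in this table were computed. As a minimal sketch of one common definition, the percentage of identical residues over the aligned columns in which neither sequence has a gap, the following Python function can be used. The function name, the gap character and the rounding to one decimal are assumptions made for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: both strings come from the same pairwise alignment,
    use '-' for gaps, and columns containing a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns where both sequences have a residue.
    compared = [(a, b) for a, b in zip(aligned_a, aligned_b)
                if a != "-" and b != "-"]
    if not compared:
        return 0.0
    matches = sum(1 for a, b in compared if a == b)
    return round(100.0 * matches / len(compared), 1)
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values when the structure lacks residues present in the experimental sequence.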


Table S2C. Identity between proteasome structures and the experimental sequences

Reference    Yeast protein    Complex    Identity (%)

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 97.1

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 97.1

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 82.9

5A5B-centered chain E Pre2 Proteasome 99.5

5A5B-centered chain EA Rpn11 Proteasome 89.5

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 85.9


5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 93.1

5A5B-centered chain W Rpn2 Proteasome 90.9

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 93.1

Constructed proteasome chain 22 Rpn2 Proteasome 90.9

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 82.9

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 89.5

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 85.9

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100


Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100


Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Yeast protein    Complex    Reference    Number of missing residues in C-terminus

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths, according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results for the intra-complex experiments. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
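
The caption for panels (B) and (C) relies on two simple statistics: a z-score cutoff on colony size and a correlation coefficient between reciprocal measurements. The sketch below shows how both could be computed in Python. It assumes a cutoff of 2.5 and z-scores taken relative to the mean and standard deviation of all colonies in one experiment; the source does not specify the reference distribution, and all function and variable names are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values against their own mean and population SD."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def positive_ppis(colony_sizes, threshold=2.5):
    """Return the names of pairs whose colony-size z-score exceeds the cutoff.

    colony_sizes: dict mapping a pair label to its colony size.
    threshold: assumed cutoff, not confirmed by the source.
    """
    names = list(colony_sizes)
    scores = z_scores([colony_sizes[n] for n in names])
    return [n for n, z in zip(names, scores) if z > threshold]


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of signals."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5
```

For reciprocal interactions, `xs` would hold the signal of each pair in one bait-prey orientation and `ys` the signal of the same pairs in the swapped orientation.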


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres/dots: surface. Green spheres: surface residues on the proteasome.
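
The distance-weighted shortest path described in panels (C) and (D) can be illustrated with Dijkstra's algorithm on a graph whose nodes are surface residues and whose edge weights are Euclidean distances. This is only a sketch under stated assumptions: the source does not describe how the graph was built, so the neighbor cutoff (8.0, in the same units as the coordinates), the function name and the input layout are all hypothetical.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, cutoff=8.0):
    """Dijkstra shortest path between two residues over surface points.

    coords: dict mapping a residue label to its (x, y, z) coordinates.
    cutoff: assumed maximum distance for two residues to be neighbors.
    Returns (path_length, [labels]) or (math.inf, []) if unreachable.
    """
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == end:
            break
        for v, xyz in coords.items():
            if v in visited:
                continue
            w = math.dist(coords[u], xyz)  # edge weight = Euclidean distance
            if w > cutoff:
                continue  # not neighbors on the surface graph
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if end not in dist:
        return math.inf, []
    path = [end]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[end], path[::-1]
```

Restricting edges to nearby surface residues forces the path to follow the surface of the complex rather than cut through it, which is why such a distance can exceed the straight-line distance between two C-termini.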


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close in space without necessarily interacting physically. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. Concretely, the approach consisted in modifying the length of the linkers of the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results obtained with the distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary depending on the number of combinations of linkers of different lengths used. For example, it could be lower if only a single combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and those whose signal increases. The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. However, these cases can still provide structural information, notably by identifying the strongest associations within the complex. Moreover, the use of extended linkers provides information on the organization of protein complexes, particularly when it involves the central proteins. Finally, the associations detected reflect well the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results obtained, it is possible to confirm that increasing the linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to the DHFR PCA represents a valuable advance in the study of protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2] fragment, it is possible to increase the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of additional lengths, which would provide a validation and a point of comparison and would help detect problems that may have occurred during the construction of the proteins. For example, it is easier to spot a problem of incorrect recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. However, adding this control clearly makes the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. In addition, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under multiple growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths providing several resolutions of the PPI network. It would first be necessary to determine the maximum length that allows the detection of plausible protein-protein associations while limiting false positives. It would also be necessary to determine the optimal increment to maximize new information, taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which decreases the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by extending the linker would provide a general picture of their architecture. For the proteome of an organism, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.

List of tables

Table S1A. Description of the strains constructed and used for this study
Table S1B. PCA data for global PCA experiment
Table S1C. PCA data for intra-complexes experiment
Table S1D. PCR primers used in this study
Table S2A. Distances between C-termini calculated from molecular modeling
Table S2B. Identity between each RNApol structure and the experimental sequences
Table S2C. Identity between proteasome structure and the experimental sequence
Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

List of figures

Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation (PCA) screen and prove to be useful to infer the super-organization of protein complexes
Figure 2. Longer linkers allow for the detection of more distant proteins within complexes
Figure S1. Data related to the PCA experiments
Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

List of abbreviations

% Percentage
°C Degree Celsius
Å Ångström
DNA Deoxyribonucleic acid
Amp Ampicillin
mRNA Messenger ribonucleic acid
BioID Proximity-dependent biotinylation
ClonNAT Nourseothricin
COG Conserved oligomeric Golgi
DHFR Dihydrofolate reductase
DMSO Dimethyl sulfoxide
F[1,2] Fragment 1,2 of DHFR
F[3] Fragment 3 of DHFR
FDR Corrected P value
FRET Energy transfer between fluorescent molecules
g Gram
Gly or G Glycine
h Hour
HygB Hygromycin B
Is Interaction score
L Liter
Log Logarithm
M Molar
Min Minute
mL Milliliter
mM Millimolar
MS Mass spectrometry
MS/MS Tandem mass spectrometry
MTX Methotrexate
MYTH Membrane yeast two-hybrid
NaCl Sodium chloride
NMR Nuclear magnetic resonance
OD Optical density
PBS Phosphate-buffered saline
PCA Protein-fragment complementation assay
PCR Polymerase chain reaction
PKA Protein kinase A
PPI Protein-protein interaction
Q1 Quartile 1
Q3 Quartile 3
r Correlation coefficient
RNApol RNA polymerase
Sdb Standard deviation
Ser or S Serine
SDS Sodium dodecyl sulfate
SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test Student's t-test
YPD Yeast extract peptone dextrose
Y2H Yeast two-hybrid
Zs Z-score
μb Estimated mean
μg Microgram
μL Microliter
μM Micromolar
2YT 2× yeast extract tryptone
2xL Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif

Acknowledgments

The completion of this project required the help of several people whom I sincerely wish to thank. First of all, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey, Christian encouraged me to give the best of myself, both scientifically and as a member of the group. He not only gave me the material means to do so, but also showed me that I had the ability to do it. Christian is a very present supervisor who is available for his students. He offered me opportunities and supported me in each of them.

I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr. Nicolas Bisson, for their advice and the time they devoted to this project.

I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the laboratory's two research professionals. Their great expertise and their passion for science are a pillar of this team. Without their invaluable advice, dedication, and availability, carrying out this project would have been particularly arduous. I also wish to thank my collaborators, Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is all the better. Special thanks to Xavier for his help, his availability, and the stimulating discussions.

I think it is important to thank all the members of the Landry laboratory. Graduate studies require spending a great deal of time in the laboratory, which becomes like a second home. Hence the importance of sharing laughs and building a rapport with its members. I would like to thank them all for the chatter and laughter at the famous "tea breaks", the lively discussions and, of course, the support, both in the laboratory and morally. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. A very special thank you to Guillaume and Hélène, who were particularly good at putting a smile on my face or at supporting and advising me in difficult times.

It is also important to thank my parents, as well as all my family and friends. My parents have always encouraged me to fulfill myself and to love my work. They not only provided me with an ideal environment in which to reach my goals throughout my studies, but also offered me their moral support and instilled in me the importance of always doing one's best. The values they passed on to me gave me a strong sense of responsibility, honesty, and commitment. Thanks to my family and friends, I was able to unwind, simply have fun, and pour my heart out from time to time. They were a source of moral support.

Finally, I wish to thank, from the bottom of my heart, my partner, Marc Bélanger. Marc is an incredibly generous person: generous with his time, his attention, his knowledge, and his passions. He was an invaluable support throughout this journey, at every moment. His encouragement, his shoulder, his tissues, and his understanding soothed my fears and my sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has brought me personally, humanly, and professionally. Marc has made me a better person, and I will always be grateful to him. Thank you, my love, thank you for everything.

Foreword

This thesis comprises a single chapter written in the form of a scientific article that will be submitted for publication. The article presents the adaptation of the PCA method to detect associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to the planning of the experiments with Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault, and Alexandre K. Dubé (research professionals). Several people, including myself, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé, and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were carried out jointly by Isabelle Gagnon-Arsenault, Christian Landry, and myself.

During this project, I also contributed to the writing of a literature review published in Briefings in functional genomics in March 2016 under the title Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student), and Christian R. Landry. This article is not presented in this thesis.

General introduction

1.1 The fundamental nature of protein-protein interactions

Proteins, through their great diversity of roles, are considered the machinery of life. Their temporary or permanent associations are at the heart of signaling and regulatory pathways, as well as of protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces, and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since they are involved in all cellular processes as well as in the maintenance of cellular functions.

Interactions that form transiently are often found in signaling and regulatory processes. They require excellent spatiotemporal coordination, which explains why poor coordination can lead to diseases such as cancer (1). An example of a transient association is that of the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, aging, and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).

In addition to transient interactions, the cell is home to stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs within a complex are stable, the complex may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins allows the emergence of additional biological activities that would be impossible for the individual proteins. An example that illustrates this concept very well is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).

The two preceding examples clearly demonstrate the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning, and cellular viability of a given organism. The PPI network has been mapped on a large scale for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26), and several viruses (27-29). These maps represent a static picture of the network and do not fully take into account the cell's capacity to adapt to different conditions (e.g., environment, cell cycle). To overcome this limitation, additional maps were subsequently produced that consider the dynamics of interaction networks, namely by perturbing cell growth conditions. Among other things, they provide information on the adaptation, or plasticity, of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.
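
As a minimal sketch of how a static map can be compared with a map obtained under a perturbed growth condition, the Python snippet below stores each map as a set of unordered protein pairs and reports which interactions are maintained, lost, or gained. The protein names (P1 to P4), the function names, and the pair-set representation are illustrative assumptions, not data or code from this work.

```python
# Illustrative sketch only: the proteins and both maps are hypothetical.

def normalize(pairs):
    """Store each interaction as a frozenset so (A, B) and (B, A) are identical."""
    return {frozenset(pair) for pair in pairs}


def compare_maps(reference, perturbed):
    """Classify interactions by their presence in the two interaction maps."""
    ref = normalize(reference)
    per = normalize(perturbed)
    return {
        "maintained": ref & per,  # detected in both conditions
        "lost": ref - per,        # detected only in the reference condition
        "gained": per - ref,      # detected only in the perturbed condition
    }


if __name__ == "__main__":
    reference_map = [("P1", "P2"), ("P2", "P3")]
    perturbed_map = [("P2", "P1"), ("P3", "P4")]
    for status, interactions in compare_maps(reference_map, perturbed_map).items():
        print(status, sorted(sorted(pair) for pair in interactions))
```

Such a comparison only says whether an interaction was detected in each condition; as noted above, it does not by itself separate stable from transient interactions.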

1.2 Concrete applications of the study of protein-protein interactions

The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as demonstrated by the study of the nuclear pore in yeast and trypanosomes (30). These two organisms, which diverged more than 1.5 billion years ago, show similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the membrane of the cell nucleus and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The differences in structure explain the distinct mRNA export mechanisms in the two organisms (30). In addition, perturbing PPIs makes it possible to elucidate the robustness of a protein complex to mutations, that is, the ability of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analyzing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of certain proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).

In the medical field, the study of PPIs has been widely used to discover new drugs (32-34). Moreover, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi, and Trypanosoma brucei, which may eventually make it possible to treat the infections caused by these parasites (35). PPIs also help in understanding the genetic basis of diseases, as shown by Sahni and colleagues. This team studied nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, a partial or complete perturbation of interaction networks was responsible for the diseases under study. Furthermore, different mutations in the same gene lead to different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, since each has advantages and limitations that allow it to target only a particular subset of the interaction network (37). Nevertheless, the methods as a whole can be divided into two main categories: methods that determine the composition of protein complexes, and methods that determine the physical interactions between two proteins.

The first category includes methods that purify a protein complex, by either affinity or separation chromatography, and then analyze it by mass spectrometry (MS). The second category brings together a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast two-hybrid (MYTH), and the protein-fragment complementation assay (PCA). The methods in the second category rely on a very similar principle, based on the reconstitution of a functional reporter that emits a signal when the two proteins physically interact. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, the term "hybrid method" refers to methods that detect associations between proteins that are close together in space without these associations necessarily being physical interactions. These methods therefore have characteristics of both categories. For the purposes of this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary, because they make it possible to define, on the one hand, the components of a protein complex and, on the other hand, the relationships among them.

1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purification of protein complexes followed by identification of their components by MS is a method that aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract by means of an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the nonspecific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, resulting in a mass spectrum. Comparing the profiles obtained with those in a database makes it possible to identify the proteins found in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first MS, a peptide is selected and fragmented, and a new spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this purification is that it makes it possible to isolate all of the protein complexes of an organism for study (43).
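
To make the mass-to-charge ratio concrete, the following sketch computes the monoisotopic mass of a peptide and the m/z expected for a given positive charge state, using m/z = (M + z × m_proton) / z. The function names and the example peptide (GGGGS, borrowed from the linker motif listed in the abbreviations) are illustrative assumptions; actual protein identification relies on dedicated database-search software, not on this calculation alone.

```python
# Illustrative sketch: monoisotopic peptide mass and m/z for a charge state.

# Monoisotopic masses (Da) of amino acid residues (amino acid minus water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.010565   # added once per peptide for the free termini
PROTON_MASS = 1.007276   # mass carried by each added charge


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS


def mass_to_charge(sequence, charge):
    """m/z of the peptide carrying `charge` protons (charge >= 1)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON_MASS) / charge


if __name__ == "__main__":
    for z in (1, 2):
        print(f"GGGGS, z={z}: m/z = {mass_to_charge('GGGGS', z):.4f}")
```

The same peptide therefore appears at different positions in the spectrum depending on its charge state, which is one reason observed spectra are matched against computed values from a sequence database.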

1.3.2 Methods determining the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid, and the protein-fragment complementation assay

Y2H, MYTH, and PCA are techniques based on the assembly of complementary reporter fragments, each linked to one of the two proteins of interest via a linker. When the two proteins of interest physically interact, the two reporter fragments assemble, reconstituting a functional reporter that allows a signal to be detected. In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that enabled the development of several other methods. However, this technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Nevertheless, variations of the method have been introduced to allow the study of membrane proteins (45-47). This approach is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must be located in the nucleus, possibly altering the endogenous localization of the proteins. The technique also has low sensitivity, produces background noise, and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to establish links between the abundance of a protein and the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as humans, to be studied in a simpler model (51).

In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription factor is attached. In the presence of a physical interaction between the proteins of interest, the transcription factor attached to the reconstituted ubiquitin is released, thereby activating transcription of a reporter gene. Methods based on split-ubiquitin have enabled major advances in the study of membrane proteins that are insoluble and located outside the nucleus. However, MYTH shares certain drawbacks with Y2H, such as substantial background noise and the impossibility of quantifying the results (47-50, 52, 53).

PCA is similar to the two methods described above, but rather than using a transcription factor as a reporter, it uses a protein that has been split into two fragments. The choice of the reporter and of the split site were decisive elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to associate spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved notably in the synthesis of certain DNA bases (the purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative while keeping the proteins studied under their native promoters (48, 55, 56). In addition, results obtained by PCA suggest that the cellular localization of proteins is preserved: there is a gene ontology enrichment for several known proteins sharing the same cellular localization (55). However, a change in localization cannot be ruled out, given that the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function, or abundance of the protein. Adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. The composition or length of the linker may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers, and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are mostly useful when a flexible linker is insufficient to properly separate the two elements or when it interferes with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on energy transfer between two fluorescent proteins in close proximity. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are close to each other. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and the partial overlap of the fluorescence of the two proteins can hamper interpretation of the results (60-63).

Cross-linking followed by MS is practically identical to the purification and MS techniques, except that before purification the proteins are attached to one another by covalent bonds. These bonds withstand enzymatic digestion, thus providing structural information on how the proteins associate within the protein complex. Nevertheless, cross-linking complicates data analysis and can lead to an incorrect picture of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity, fused to the protein of interest. Interactors bearing a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to interactions between neighbouring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can impair the activity of the protein of interest or the formation of interactions. In addition, this method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They provide information on the proximity of proteins, giving access to a scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply at large scale. Developing simpler, higher-throughput hybrid methods would help better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. These new hybrid methods would compensate for the shortcomings of high-molecular-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to many protein complexes and require an approach specific to each complex.

1.5 The linker, a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and of the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble, flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good association of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small amino acids, the first nonpolar and the second polar and uncharged. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also imposes a constraint on the ability to detect an interaction, as was notably observed by the research team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. Moreover, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to obtain new structural information.
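The reasoning above can be illustrated with a rough calculation. The sketch below takes the 82 Å figure reported for two 2xL linkers end to end (20 residues in total) and scales it linearly with the number of GGGGS repeats. The linear scaling, the function name, and the application to longer linkers are assumptions made for illustration only; they are not measurements from the study, and a flexible linker in the cell will rarely be fully extended.

```python
# Hypothetical back-of-the-envelope estimate; linear scaling is an assumption.
RESIDUES_PER_REPEAT = 5           # one GGGGS motif
REFERENCE_SPAN_ANGSTROM = 82.0    # two 2-repeat linkers end to end (value from the text)
REFERENCE_RESIDUES = (2 + 2) * RESIDUES_PER_REPEAT  # 20 residues


def max_span(repeats_a: int, repeats_b: int) -> float:
    """Approximate maximal reach (in angstroms) of two linkers placed end to end."""
    residues = (repeats_a + repeats_b) * RESIDUES_PER_REPEAT
    return REFERENCE_SPAN_ANGSTROM * residues / REFERENCE_RESIDUES


if __name__ == "__main__":
    for a, b in [(2, 2), (2, 3), (2, 4), (4, 4)]:
        print(f"{a}xL + {b}xL: about {max_span(a, b):.1f} angstroms")
```

Under these assumptions, doubling both linkers doubles the estimated reach, which is the intuition behind using linker length to modulate the scale of resolution.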

1.6 Research objectives

The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the length of the DHFR PCA linker would allow the detection of interactions increasingly distant in space, which would modulate the scale of molecular resolution. This adaptation would then yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II, and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The final objective was to test whether increasing linker length allowed the detection of associations between proteins more distant in space. To do so, distances were calculated between the proteins contained in the proteasome structures and compared with the experimental results.

This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive for several reasons, notably the availability of numerous powerful genetic tools, its rapid cell division, and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields such as the determination of protein function, regulatory networks, gene expression, protein interaction networks, and the study of human diseases (70).

Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system functions requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the dihydrofolate reductase-based protein-fragment complementation assay (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.

Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are further apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.

Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational, and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex membership or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, assignment of a protein to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87), and technologies based on similar principles (52). These two categories are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here we test the effect of linker size on the ability to detect PPIs in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose, and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine, and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).

Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl, and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm, and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.

In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI, and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli, and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR-amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. The 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing of all strains confirmed proper DHFR fragment fusions.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains with proteins fused with the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies raised against them. Exponentially growing cells (20 OD600) were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL pepstatin A, 0.5 µg/mL leupeptin, and 2 µg/mL aprotinin). Glass beads (425-600 µm; Sigma) were added (0.1 g), and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p, and Bcy1p) SDS-PAGE gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p, or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000), or mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10000), IRDye® 680RD donkey anti-goat IgG (1:5000), or IRDye® 800CW goat anti-mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed, and the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them, based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published in (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II, and III, the proteasome, and the COG complex. These complexes were selected because they vary in size (RNApol I (n = 14), II (n = 12), III (n = 17), and associated proteins (n = 9; 7 tested); proteasome (n = 47; 44 tested); and COG complex (n = 8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 of the 95 proteins initially selected are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL, and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
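The block design described above, in which every bait-prey pair is tested with all four linker combinations, can be sketched as a small enumeration. This is only an illustration of the combinatorial layout; the function name and the protein labels are hypothetical, and the actual array positions were assigned by the authors' own plate design.

```python
from itertools import product

# The two linker sizes used in the intra-complexes experiment.
LINKER_SIZES = ("2xL", "4xL")


def linker_block(bait: str, prey: str, sizes=LINKER_SIZES):
    """Return the bait/prey strain labels making up one block of four crosses."""
    return [(f"{bait}-{b}", f"{prey}-{p}") for b, p in product(sizes, repeat=2)]


if __name__ == "__main__":
    # "BaitA" and "PreyB" are placeholder protein names.
    for bait_strain, prey_strain in linker_block("BaitA", "PreyB"):
        print(bait_strain, "x", prey_strain)
```

The enumeration order (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) matches the order in which the combinations are listed in the text.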

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed into 384-format arrays and finally into 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey plates were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Each 1536-format bait plate was then crossed with the two 1536-format prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB, with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the difference in signal between linker lengths was positive, null, or negative. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed, and 6 µL of each dilution was spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files, and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function "Analyze Particles" in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, using the particle closest to the center. We considered the particle as being a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
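To make the particle filter concrete, the following sketch applies area and circularity cut-offs to a list of particles. It uses the circularity definition employed by ImageJ, 4π × area / perimeter², so that a perfect circle scores 1. The `Particle` class and function names are hypothetical, and the reading of the criterion as "keep only particles with area of at least 20 pixels and circularity of at least 0.5" is an assumption; the authors' custom ImageJ script is not reproduced here.

```python
import math
from dataclasses import dataclass


@dataclass
class Particle:
    area: float       # in pixels
    perimeter: float  # in pixels


def circularity(p: Particle) -> float:
    """Circularity as defined in ImageJ: 4*pi*area / perimeter**2."""
    return 4 * math.pi * p.area / p.perimeter ** 2


def keep_particle(p: Particle, min_area: float = 20, min_circularity: float = 0.5) -> bool:
    """Assumed filter: discard particles that are too small or too irregular."""
    return p.area >= min_area and circularity(p) >= min_circularity


if __name__ == "__main__":
    round_colony = Particle(area=math.pi * 10 ** 2, perimeter=2 * math.pi * 10)
    streak = Particle(area=100, perimeter=202)  # a 100 x 1 pixel rectangle
    print(keep_particle(round_colony), keep_particle(streak))
```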

Colony intensity values from day 4 of growth on the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
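The two steps above can be sketched as follows: colonies that were too small on the diploid selection plate are dropped, and the remaining MTX intensities are log2-transformed after adding 1. The function name and the paired-list input format are assumptions for illustration, not the authors' actual data structures.

```python
import math


def transform_colonies(mtx_intensities, diploid_sizes, min_diploid_size=16):
    """Return log2(intensity + 1) for colonies whose diploid size is at least the cut-off."""
    return [
        math.log2(intensity + 1)
        for intensity, size in zip(mtx_intensities, diploid_sizes)
        if size >= min_diploid_size
    ]


if __name__ == "__main__":
    # The third colony is removed because its diploid size (10) is below 16.
    print(transform_colonies([0, 1, 3, 7], [20, 20, 10, 16]))
```

Adding 1 before the transformation keeps colonies with zero intensity in the data set, since log2(0 + 1) = 0 is defined whereas log2(0) is not.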

For the global PCA experiment, interactions with at least two replicates for all linker combinations were conserved, and the median colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
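
The conversion of interaction scores into z-scores can be sketched as follows. The analysis itself relied on the R package mixtools; the Python code below is only a stand-in that fits a two-component normal mixture with a basic EM loop. The function names, the starting values (median and 95th percentile), the simulated scores and the rule that the component with the lower mean is the background are all assumptions.

```python
import math
import random
from statistics import pstdev


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def fit_background(scores, n_iter=200):
    """EM fit of a two-component 1-D normal mixture.

    Returns (b, sdb): mean and standard deviation of the component with
    the lower mean, taken here to be the background (assumption).
    """
    xs = sorted(scores)
    n = len(xs)
    mu = [xs[n // 2], xs[(19 * n) // 20]]   # heuristic starting means
    sd = [pstdev(xs) or 1.0] * 2
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in xs:
            p = [w[k] * normal_pdf(x, mu[k], sd[k]) for k in (0, 1)]
            total = sum(p) or 1e-300
            resp.append((p[0] / total, p[1] / total))
        # M-step: update weights, means and standard deviations.
        for k in (0, 1):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sd[k] = max(math.sqrt(var), 1e-6)
            w[k] = nk / n
    k_bg = 0 if mu[0] <= mu[1] else 1
    return mu[k_bg], sd[k_bg]


# Simulated interaction scores: 500 background pairs and 50 interacting pairs.
random.seed(0)
scores = ([random.gauss(2.0, 0.5) for _ in range(500)]
          + [random.gauss(8.0, 1.0) for _ in range(50)])
b, sdb = fit_background(scores)
zs = [(s - b) / sdb for s in scores]      # Zs = (Is - b) / sdb
detected = [z > 2.5 for z in zs]          # detection threshold used in the text
```

Fitting the background separately for each linker combination, as described above, amounts to calling `fit_background` once per combination.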

For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values further from the median than Q1 - 3(Q3 - Q1) or Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were conserved, and the median colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented inconsistent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
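
The outlier rule and the replicate requirement can be written compactly. In this sketch, the function names, the quartile method ("inclusive", which matches the default in R) and the application of the filter to a single group of replicate colony sizes are assumptions; the text does not specify these details.

```python
from statistics import median, quantiles


def drop_extreme_outliers(values, k=3.0):
    """Keep values within [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def interaction_score(replicates, min_replicates=4):
    """Median colony size after outlier removal, or None if too few replicates."""
    if len(replicates) < 2:
        return None
    kept = drop_extreme_outliers(replicates)
    if len(kept) < min_replicates:
        return None
    return median(kept)


# Toy replicate colony sizes with one aberrant colony.
replicates = [10, 11, 12, 13, 14, 15, 100]
kept = drop_extreme_outliers(replicates)
score = interaction_score(replicates)
```

With these toy values the quartiles are 11.5 and 14.5, so the accepted range is 2.5 to 23.5 and the colony of size 100 is discarded before the median is taken.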

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, baits and preys were positioned such that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), so the Is for the different linker size combinations could be compared directly. The difference from the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
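
The three categories can be expressed as a small decision rule. The sketch below assumes that the FDR correction is the Benjamini-Hochberg procedure, which the text does not state, and takes the paired t-test p-values as given. The function names are hypothetical, and pairs with a significant p-value but an absolute difference of 1 or less are treated as unchanged, which is also an assumption.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def classify_pair(detected_ref, detected_longer, diffs, fdr_pvalues,
                  min_diff=1.0, alpha=0.05):
    """Classify one protein pair.

    detected_ref: detected with the 2xL-2xL reference.
    detected_longer: detected with at least one longer linker combination.
    diffs, fdr_pvalues: difference from the reference and FDR-corrected
    p-value for each longer linker combination.
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    changed = any(abs(d) > min_diff and q < alpha
                  for d, q in zip(diffs, fdr_pvalues))
    return "quantitative change" if changed else "unchanged"


# Toy p-values from four paired t-tests.
raw_p = [0.01, 0.04, 0.03, 0.20]
fdr_p = benjamini_hochberg(raw_p)
```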

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.

The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the two nodes. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal distances possible was used for the analyses.
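
The surface-path distance can be illustrated on a toy graph. The analysis used the Dijkstra implementation in NetworkX; the sketch below substitutes a standard-library version of the same algorithm so that it runs without dependencies. The function names, the toy coordinates and the cutoff value are assumptions chosen for illustration.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`.

    coords maps a node name to its (x, y, z) position; each edge is
    weighted by the Euclidean distance between its two nodes.
    """
    nodes = list(coords)
    graph = {node: [] for node in nodes}
    for idx, a in enumerate(nodes):
        for b in nodes[idx + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Weighted shortest-path length (Dijkstra); math.inf if unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return math.inf


# Toy "surface": four nodes on a square plus one isolated node.
toy_coords = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (10, 10, 0),
              "D": (0, 10, 0), "E": (100, 0, 0)}
toy_graph = build_graph(toy_coords, cutoff=12.0)
path_ac = shortest_path_length(toy_graph, "A", "C")
```

Here the diagonal from A to C (about 14.1 units) exceeds the cutoff, so the path has to follow two sides of the square. This mirrors how a path constrained to surface residues can be longer than the straight-line distance between two C-termini.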

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n = 45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtering (197 unique; reciprocal interactions such as X-DHFR F[1,2]/Y-DHFR F[3] and Y-DHFR F[1,2]/X-DHFR F[3] count as a single PPI).

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]/Y-DHFR F[3] versus Y-DHFR F[1,2]/X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228 or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This effect may thus arise from steric effects rather than from destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions were detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid), regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.

Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering each pair of reciprocal interactions, such as X-DHFR F[1,2]/Y-DHFR F[3] and Y-DHFR F[1,2]/X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.

Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.

Table S1A Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request

Table S1B PCA data for global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request

Table S1C PCA data for intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request

Table S1D PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request

Table S2A Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference Yeast proteins Complex Identity (%)

4C2M chain 1 Rpc10 RNApol I 100

4C2M chain 2 Rpa34 RNApol I 92.4

4C2M chain 3 Rpa49 RNApol I 94.4

4C2M chain 4 Rpa43 RNApol I 100

4C2M chain 5 Rpa190 RNApol I 89.7

4C2M chain 6 Rpc40 RNApol I 100

4C2M chain 7 Rpa135 RNApol I 97.2

4C2M chain 8 Rpb5 RNApol I 100

4C2M chain 9 Rpa14 RNApol I 59.6

4C2M chain 10 Rpa43 RNApol I 81.4

4C2M chain 11 Rpo26 RNApol I 100

4C2M chain 12 Rpa12 RNApol I 100

4C2M chain 13 Rpb8 RNApol I 88.2

4C2M chain 14 Rpc19 RNApol I 100

4C2M chain 15 Rpb10 RNApol I 100

4C2M chain 16 Rpa49 RNApol I 100

4C2M chain 17 Rpc10 RNApol I 100

4C2M chain 18 Rpa43 RNApol I 100

4C2M chain 19 Rpa34 RNApol I 92.4

4C2M chain 20 Rpa135 RNApol I 96.2

4C2M chain 21 Rpa190 RNApol I 88.5

4C2M chain 22 Rpa14 RNApol I 55.1

4C2M chain 23 Rpc40 RNApol I 100

4C2M chain 24 Rpo26 RNApol I 100

4C2M chain 25 Rpb5 RNApol I 100

4C2M chain 26 Rpb8 RNApol I 88.2

4C2M chain 27 Rpa43 RNApol I 80.2

4C2M chain 28 Rpb10 RNApol I 100

4C2M chain 29 Rpa12 RNApol I 96

4C2M chain 30 Rpc19 RNApol I 100

4C3I chain A Rpa190 RNApol I 89.2

4C3I chain C Rpc40 RNApol I 99.3

4C3I chain B Rpa135 RNApol I 98.2

4C3I chain E Rpb5 RNApol I 100

4C3I chain D Rpa14 RNApol I 55.1

4C3I chain G Rpa43 RNApol I 78.3

4C3I chain F Rpo26 RNApol I 100

4C3I chain I Rpa12 RNApol I 100

4C3I chain H Rpb8 RNApol I 84.7

4C3I chain K Rpc19 RNApol I 100

4C3I chain J Rpb10 RNApol I 100

4C3I chain M Rpa49 RNApol I 97.2

4C3I chain L Rpc10 RNApol I 100

4C3I chain N Rpa34 RNApol I 88

4V1N chain A Rpo21 RNApol II 97.9

4V1N chain C Rpb3 RNApol II 100

4V1N chain B Rpb2 RNApol II 93.6

4V1N chain E Rpb5 RNApol II 100

4V1N chain D Rpb4 RNApol II 80.8

4V1N chain G Rpb7 RNApol II 100

4V1N chain F Rpo26 RNApol II 100

4V1N chain I Rpb9 RNApol II 100

4V1N chain H Rpb8 RNApol II 91

4V1N chain K Rpb11 RNApol II 100

4V1N chain J Rpb10 RNApol II 100

4V1N chain L Rpc10 RNApol II 100

4V1N chain R Tfg2 RNApol II 60.3

5FJA chain A Rpo31 RNApol III 96.2

5FJA chain C Rpc40 RNApol III 100

5FJA chain B Ret1 RNApol III 100

5FJA chain E Rpb5 RNApol III 100

5FJA chain D Rpc17 RNApol III 73.9

5FJA chain G Rpc25 RNApol III 85.8

5FJA chain F Rpo26 RNApol III 100

5FJA chain I Rpc11 RNApol III 82.7

5FJA chain H Rpb8 RNApol III 94.5

5FJA chain K Rpc19 RNApol III 100

5FJA chain J Rpb10 RNApol III 100

5FJA chain M Rpc37 RNApol III 84.9

5FJA chain L Rpc10 RNApol III 100

5FJA chain O Rpc82 RNApol III 84.3

5FJA chain N Rpc53 RNApol III 73.8

5FJA chain Q Rpc31 RNApol III 100

5FJA chain P Rpc34 RNApol III 57.2

Table S2C. Identity between the proteasome structure and the experimental sequences

Reference Yeast proteins Complex Identity (%)

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 97.1

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 97.1

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 82.9

5A5B-centered chain E Pre2 Proteasome 99.5

5A5B-centered chain EA Rpn11 Proteasome 89.5

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 85.9

5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 93.1

5A5B-centered chain W Rpn2 Proteasome 90.9

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 93.1

Constructed proteasome chain 22 Rpn2 Proteasome 90.9

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 82.9

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 89.5

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 85.9

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100

Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100

Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Yeast proteins Complex Reference Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths, according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect interactions between proteins that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in standard PCA (left).
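The thresholding described in (B) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the z-score of a colony is taken to be its size centered and scaled by the mean and standard deviation of a background sample, the background is supplied by the caller, and the function names and example numbers are hypothetical. It is not the analysis code used to produce the figure.

```python
from statistics import mean, stdev


def z_scores(colony_sizes, background):
    """Convert colony sizes to z-scores relative to a background sample.

    Assumption: the background mean and standard deviation are estimated
    directly from the values passed in `background`.
    """
    mu_b = mean(background)
    sd_b = stdev(background)
    return [(size - mu_b) / sd_b for size in colony_sizes]


def call_positives(colony_sizes, background, threshold):
    """Flag each colony whose z-score is strictly above `threshold`."""
    return [z > threshold for z in z_scores(colony_sizes, background)]


# Toy example: background mean 2, standard deviation 1.
background = [1, 2, 3]
print(call_positives([2, 5], background, threshold=2.5))  # [False, True]
```

Keeping the threshold as an explicit argument makes it easy to check how the set of positive PPIs changes with the cutoff.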


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
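The idea of a distance-weighted shortest path over surface residues, as in (C) and (D), can be illustrated with a small Python sketch. It is a generic Dijkstra search written for illustration only. The assumptions are mine: surface residues are reduced to 3D points, two points are connected when they lie within a cutoff distance, and each edge is weighted by its Euclidean length. The function name, the cutoff and the toy coordinates are hypothetical and do not come from the thesis.

```python
import heapq
import math


def surface_path_length(points, start, end, cutoff):
    """Length of the shortest path between two points of a point cloud.

    points: list of (x, y, z) coordinates, e.g. one per surface residue.
    start, end: indices into `points`, e.g. the two C-terminal residues.
    cutoff: two points are linked only if they are at most this far apart
            (hypothetical parameter for this sketch).
    Returns math.inf when no path exists.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == end:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for other, coords in enumerate(points):
            if other == node:
                continue
            step = math.dist(points[node], coords)
            if step > cutoff:
                continue  # not neighbours on the surface graph
            candidate = dist + step
            if candidate < best.get(other, math.inf):
                best[other] = candidate
                heapq.heappush(queue, (candidate, other))
    return math.inf


# Toy example: the direct hop (length 5) exceeds the cutoff,
# so the path goes around the corner (3 + 4).
corner = [(0, 0, 0), (3, 0, 0), (3, 4, 0)]
print(surface_path_length(corner, 0, 2, cutoff=4.5))  # 7.0
```

Because the path is forced to follow neighbouring surface points, the result is never shorter than the straight-line distance between the two termini, which is the point of measuring along the surface of a large complex.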


General conclusion

The aim of this project was to develop a relatively simple hybrid method. Here, a hybrid method is one that detects associations between proteins that are close to each other in space without necessarily interacting physically. Such a method would make it possible to dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to test whether increasing the linker length changed the interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thus improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, more than 30% of all unique interactions detected were new. One could therefore expect to obtain almost as many new interactions if this variation of PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths used. For example, it could be lower if a single combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases. The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by a destabilization of the proteins involved. These cases can nevertheless provide structural information, in particular by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when central proteins are involved. Finally, the associations detected reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to DHFR PCA represents a valuable advance in the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the linkers of other lengths. This would provide a validation and a point of comparison, and would help detect problems that may have occurred during the construction of the proteins. For example, it becomes easier to spot a faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. This control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA gives access to an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously with each other unless they are very close together, which reduces false positives. DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These features provide certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would provide several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations, so as to limit false positives. It would also be necessary to determine the optimal increment, one that maximizes the new information gained while taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide an image of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by extending the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


List of figures

Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove to be useful to infer the super-organization of protein complexes

Figure 2. Longer linkers allow for the detection of more distant proteins within complexes

Figure S1. Data related to the PCA experiments

Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins


List of abbreviations

% Percentage
°C Degree Celsius
Å Ångström
DNA Deoxyribonucleic acid
Amp Ampicillin
mRNA Messenger ribonucleic acid
BioID Proximity-dependent biotinylation
ClonNAT Nourseothricin
COG Conserved oligomeric Golgi
DHFR Dihydrofolate reductase
DMSO Dimethyl sulfoxide
F[1,2] Fragment 1,2 of DHFR
F[3] Fragment 3 of DHFR
FDR Corrected P-value
FRET Energy transfer between fluorescent molecules
g Gram
Gly or G Glycine
h Hour
HygB Hygromycin B
Is Interaction score
L Liter
Log Logarithm
M Molar
Min Minute
mL Milliliter
mM Millimolar
MS Mass spectrometry
MS/MS Tandem mass spectrometry
MTX Methotrexate
MYTH Membrane yeast two-hybrid
NaCl Sodium chloride
NMR Nuclear magnetic resonance
OD Optical density
PBS Phosphate-buffered saline
PCA Protein-fragment complementation
PCR Polymerase chain reaction
PKA Protein kinase A
PPI Protein-protein interaction
Q1 Quartile 1
Q3 Quartile 3
r Correlation coefficient
RNApol RNA polymerase
Sdb Standard deviation
Ser or S Serine
SDS Sodium dodecyl sulfate
SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test Student's t-test
YPD Yeast extract peptone dextrose
Y2H Two-hybrid
Zs Z-score
µb Estimated mean
µg Microgram
µL Microliter
µM Micromolar
2YT 2× yeast extract tryptone
2xL Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif

Acknowledgements

Completing this project required the help of many people, whom I sincerely wish to thank. First, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey, Christian encouraged me to give my best, both scientifically and as a member of the team. He not only gave me the material means to do so, but also showed me that I had the ability to do it. Christian is a very present supervisor who is available to his students. He offered me opportunities and supported me in each of them.

I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr. Nicolas Bisson, for their advice and the time they devoted to this project.

I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the two research professionals of the laboratory. Their great expertise and their passion for science are a pillar of this team. Without their invaluable advice, dedication, and availability, carrying out this project would have been particularly arduous. I also wish to thank my collaborators, Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is all the better. Special thanks to Xavier for his help, his availability, and the lively discussions.

I believe it is important to thank all the members of the Landry laboratory. Graduate studies require spending a great deal of time in the laboratory, which becomes like a second home. Hence the importance of sharing laughs and building camaraderie with its members. I would like to thank them all for the chats and laughs at the famous "tea breaks", the lively discussions and, of course, the support, both in the laboratory and morally. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. Very special thanks to Guillaume and Hélène, who were especially good at putting a smile on my face or supporting and advising me in difficult times.

It is also important to thank my parents, as well as my whole family and my friends. My parents have always encouraged me to fulfil myself and to love my work. They not only provided an ideal setting for reaching my goals throughout my studies, but also offered me their moral support and instilled in me the importance of always doing one's best. The values they passed on to me gave me a strong sense of responsibility, honesty, and commitment. Thanks to my family and friends, I was able to unwind, simply have fun, and pour my heart out from time to time. They were a source of moral support.

Finally, I wish to thank, from the bottom of my heart, my partner, Marc Bélanger. Marc is an incredibly generous person: generous with his time, his attention, his knowledge, and his passions. His support throughout this journey was invaluable, at every moment. His encouragement, his shoulder, his tissues, and his understanding soothed my fears and my sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has brought me personally, humanly, and professionally. Marc has made me a better person, and I will always be grateful to him. Thank you, my love, thank you for everything.

Foreword

This thesis comprises a single chapter written in the form of a scientific article that will be submitted for publication. The article presents the adaptation of the PCA method to detect associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to planning the experiments with Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault, and Alexandre K. Dubé (research professionals). Several people, myself included, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé, and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were done jointly by Isabelle Gagnon-Arsenault, Christian Landry, and myself.

During this project, I also contributed to writing a literature review published in Briefings in Functional Genomics in March 2016 under the title "Multi-scale perturbations of protein interactomes reveals their mechanisms of regulation, robustness and insights into genotype-phenotype maps". Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student), and Christian R. Landry. That article is not presented in this thesis.

General introduction

1.1 The fundamental role of protein-protein interactions

Proteins, through their great diversity of roles, are regarded as the machinery of life. Their temporary or permanent associations are at the heart of signalling and regulatory pathways as well as of protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces, and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since they are involved in all cellular processes and in the maintenance of cellular functions.

Interactions that form transiently are often found in signalling and regulatory processes. They require excellent spatiotemporal coordination, which explains why poor coordination can lead to diseases such as cancer (1). One example of a transient association is that of the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, ageing, and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).

In addition to transient interactions, the cell hosts stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs within a complex are stable, the complex may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins allows the emergence of additional biological activities that would be impossible for the individual proteins. A good illustration of this concept is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).

The two preceding examples clearly demonstrate the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning, and cellular viability of a given organism. The PPI network has been mapped on a large scale for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26), and several viruses (27-29). These maps provide a static picture of the network and do not fully take into account the cell's ability to adapt to different conditions (e.g., environment, cell cycle). To overcome this limitation, additional maps were later produced that capture the dynamics of interaction networks by perturbing cell growth conditions. Among other things, they provide information on the adaptation, or plasticity, of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.

1.2 Practical applications of the study of protein-protein interactions

The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore of yeast and of the trypanosome (30). These two organisms, which diverged more than 1.5 billion years ago, show similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the nuclear envelope and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mRNA export mechanisms in the two organisms (30). In addition, perturbing PPIs makes it possible to assess the robustness of a protein complex to mutations, that is, the ability of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analysing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of some proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).

In medicine, the study of PPIs has been widely used to discover new drugs (32-34). Moreover, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi, and Trypanosoma brucei, which may eventually make it possible to treat the infections caused by these parasites (35). PPIs also help to understand the genetic basis of diseases, as shown by Sahni and colleagues. This team examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, a partial or complete perturbation of the interaction networks was responsible for the diseases under study. Furthermore, different mutations in the same gene lead to different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, since each has advantages and limitations that allow it to target only a subset of the interaction network (37). Nevertheless, all of these methods can be divided into two main categories: methods that determine the composition of protein complexes and methods that determine the physical interactions between two proteins.

The first category includes methods that purify a protein complex, by affinity or separation chromatography, and then analyse it by mass spectrometry (MS). The second category comprises a wide variety of methods, including the yeast two-hybrid (Y2H), the membrane yeast two-hybrid (MYTH), and the protein-fragment complementation assay (PCA). The methods in the second category share a very similar principle, based on the reconstitution of a functional reporter that emits a signal when the two proteins physically interact. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, "hybrid method" refers to methods that detect associations between proteins that are close together in space without necessarily interacting physically. These methods therefore have characteristics of both categories. In this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary: one defines the components of a protein complex, and the other defines the relationships among them.

1.3.1 Methods that identify the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purifying protein complexes and identifying their components by MS is a method that aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract by means of an epitope specific to that protein. The epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the non-specific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins found in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed: from a first round of MS, a peptide is selected and fragmented, and a new spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this purification is that it allows all the protein complexes of an organism to be isolated for study (43).
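To make the notion of mass-to-charge ratio concrete, the short sketch below computes the m/z at which a peptide would appear for different charge states. It assumes positive-mode ionization in which each charge comes from one added proton, which is an assumption of this illustration and not a detail given in the text; the function name and the 1000 Da peptide mass are hypothetical.

```python
# Illustrative sketch only: assumes each charge is carried by one added proton.
PROTON_MASS_DA = 1.007276  # mass of a proton, in daltons


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """Return the m/z of a peptide of given neutral mass carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


if __name__ == "__main__":
    peptide_mass = 1000.0  # hypothetical neutral peptide mass, in daltons
    for z in (1, 2, 3):
        print(f"z = {z}: m/z = {mass_to_charge(peptide_mass, z):.4f}")
```

The same peptide therefore appears at a different position in the spectrum for each charge state, which is one reason spectra are compared against a database rather than read directly.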

1.3.2 Methods that determine the protein interaction network

1.3.2.1 The yeast two-hybrid, the membrane yeast two-hybrid, and the protein-fragment complementation assay

Y2H, MYTH, and PCA are techniques based on the assembly of complementary reporter fragments, each attached to one of the two proteins of interest through a linker. When the two proteins of interest physically interact, the two reporter fragments assemble, reconstituting a functional reporter that produces a detectable signal.

In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that enabled the development of several others. However, the technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variants of the method have nevertheless been developed to allow the study of membrane proteins (45-47); one of them is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must be localized in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, produces background noise, and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as humans, to be studied in a simpler model (51).

In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest physically interact, the transcription factor attached to the reconstituted ubiquitin is released, activating the transcription of a reporter gene. Methods based on split-ubiquitin have enabled major advances in the study of proteins that are membrane-bound, insoluble, or located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the impossibility of quantifying the results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of using a transcription factor as the reporter, it uses a protein that has been split into two fragments. The choice of the reporter and of the cleavage site were decisive elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to interact spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers methotrexate (MTX) resistance on the cell. This enzyme is essential for cell growth and is involved, in particular, in the synthesis of certain DNA bases (purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative and of preserving the natural promoter of the proteins studied (48, 55, 56). Furthermore, the results obtained by PCA suggest that the cellular localization of proteins is preserved: there is a gene ontology enrichment for several known proteins that share the same cellular localization (55). However, a change in localization cannot be ruled out, because the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function, or abundance of the protein. Adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. Its composition or length may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers, and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are especially useful when a flexible linker is insufficient to separate the two elements properly or interferes with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on the transfer of energy between two fluorescent proteins in close proximity. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. When the two proteins are close to each other, excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and the partial overlap of the fluorescence of the two proteins can hinder interpretation of the results (60-63).
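The dependence of FRET on proximity can be illustrated with the standard Förster relation, E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor distance and R0 is the distance at which half of the energy is transferred. The relation itself is textbook photophysics rather than something stated in this thesis, and the value of R0 used below (50 Å) is a hypothetical placeholder, since the real value depends on the fluorescent protein pair.

```python
# Illustrative sketch only: R0 below is a hypothetical value, not taken from the thesis.

def fret_efficiency(distance: float, r0: float) -> float:
    """Förster transfer efficiency for a donor-acceptor pair.

    `distance` and `r0` must be expressed in the same unit (for example Å).
    """
    if distance < 0 or r0 <= 0:
        raise ValueError("distance must be >= 0 and r0 must be > 0")
    return 1.0 / (1.0 + (distance / r0) ** 6)


if __name__ == "__main__":
    r0 = 50.0  # hypothetical Förster radius, in Å
    for r in (25.0, 50.0, 75.0, 100.0):
        print(f"r = {r:5.1f} Å -> E = {fret_efficiency(r, r0):.3f}")
```

Because of the sixth-power term, the efficiency falls steeply once the two fluorescent proteins are further apart than R0, which is why a FRET signal is read as evidence of proximity.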

Cross-linking followed by MS is practically identical to the purification and MS techniques, except that, before purification, the proteins are attached to one another by covalent bonds. These bonds withstand enzymatic digestion and thus provide structural information on how the proteins are associated within the complex. However, cross-linking complicates data analysis and can lead to a mistaken picture of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity that is fused to the protein of interest. Interactors bearing a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to interactions between neighbouring proteins. However, the biotin ligase is larger than the green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can impair the activity of the protein of interest or the formation of interactions. In addition, this method is not quantitative (68).

1.4 The current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They provide information on the proximity of proteins, giving access to a molecular scale of resolution that is otherwise difficult to reach. In addition to being complex, existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would help to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. They would also compensate for the shortcomings of high-resolution methods, such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, those high-resolution methods are difficult to apply to many protein complexes and require an approach specific to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble, flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good association of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small amino acids, the former nonpolar and the latter polar and uncharged. The linker joins the reporter fragment to the C-terminus of the proteins under study.

Linker length also imposes a constraint on the ability to detect an interaction, as notably observed by the research team that developed large-scale PCA (55). While studying RNA polymerase (RNApol) II and several other protein complexes, the authors noticed that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). Thus, it is possible to detect new interactions and thereby obtain new structural information.
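To make the distance reasoning concrete, the sketch below extrapolates the 82 Å reach reported for two 2xL linkers placed end to end (55) to longer linkers. It assumes, purely for illustration, that reach scales linearly with the number of linker residues (82 Å over 20 residues, about 4.1 Å per residue). The function name `max_reach` and the linear model are assumptions, not measured values.

```python
# Illustrative extrapolation only: reach is ASSUMED to scale linearly with
# the number of linker residues, calibrated on the 82 A reported for two
# 2xL linkers end to end (reference 55 in the text).
RESIDUES_PER_REPEAT = 5                            # one GGGGS motif
REFERENCE_REACH_A = 82.0                           # two 2xL linkers, in angstroms
REFERENCE_RESIDUES = 2 * 2 * RESIDUES_PER_REPEAT   # 20 residues in total
A_PER_RESIDUE = REFERENCE_REACH_A / REFERENCE_RESIDUES


def max_reach(repeats_bait, repeats_prey):
    """Hypothetical maximal C-terminus to C-terminus distance (angstroms)."""
    residues = (repeats_bait + repeats_prey) * RESIDUES_PER_REPEAT
    return residues * A_PER_RESIDUE


for bait, prey in [(2, 2), (3, 2), (4, 2), (4, 4)]:
    print(f"{bait}xL-{prey}xL: ~{max_reach(bait, prey):.0f} A")
```

Under this assumption, each additional GGGGS repeat on one fusion extends the nominal reach by about 20 Å. Real linkers are flexible and rarely fully extended, so these values are best read as upper bounds.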

1.6 Research objectives

The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length of the DHFR PCA would make it possible to detect interactions between proteins that are increasingly distant in space, thereby modulating the scale of molecular resolution. This adaptation would yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations were tested between 15 proteins found in seven protein complexes and the other proteins of these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes that differ in size and flexibility were studied: the proteasome, RNApol I, II, and III, and the conserved oligomeric Golgi (COG) complex. The study was performed with different combinations of linker lengths. The final objective was to determine whether increasing linker length made it possible to detect associations between proteins that are farther apart in space. To do so, distances between the proteins contained in the proteasome structures were calculated and compared with the experimental results.

This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive in several respects, notably the availability of many powerful genetic tools, its rapid cell division, and the abundance of data on protein complex structures and PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks, and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Résumé

Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural constraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational, and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex membership or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, the assignment of a protein to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87), and technologies based on similar principles (52). The two categories are potentially complementary: the first tells us which proteins assemble into complexes in the cell, and the second tells us how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods that detect co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint imposed by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter allows the detection of configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose, and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% Yeast Nitrogen Base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine, and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl, and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm, and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
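The idea of encoding additional GGGGS repeats with synonymous codons can be sketched as follows. The two DNA sequences are hypothetical examples written for this illustration and are not the sequences used in the plasmids. The sketch only checks that different codon choices translate to the same peptide, one plausible motivation (assumed here) being to limit repeated DNA sequence.

```python
# Minimal codon table restricted to glycine and serine codons.
CODON_TABLE = {
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
}


def translate(dna):
    """Translate a DNA string made only of Gly/Ser codons."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical coding sequences for two GGGGS repeats (illustration only).
REPEAT_A = "GGTGGCGGTGGATCT"
REPEAT_B = "GGAGGTGGCGGGTCC"

# Count positions at which the two DNA sequences differ.
differences = sum(a != b for a, b in zip(REPEAT_A, REPEAT_B))

print(translate(REPEAT_A), translate(REPEAT_B), differences)
```

Both sequences encode GGGGS while differing at several nucleotide positions, so the peptide is unchanged even though the DNA is not a perfect repeat.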

To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted into plasmid pUC57 with flanking BamHI and XbaI restriction sites. The 3xL/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI, and purified. Plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI, and the fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed into E. coli, and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3xL/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3xL/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI, and the fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for pAG25-3xL/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were made at the 3′ end of genes. The 2xL/3xL/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused to the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains expressing proteins fused to the 2xL and 4xL. These proteins were selected because their abundance could easily be assessed using antibodies raised against them. Exponentially growing cells (20 OD600 units) were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL pepstatin A, 0.5 µg/mL leupeptin, and 2 µg/mL aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g), and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 unit of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p, and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p, or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000), or mouse anti-actin (as a loading control; 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10,000), IRDye® 680RD donkey anti-goat IgG (1:5000), or IRDye® 800CW goat anti-mouse IgG (1:10,000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed, and the signal on the membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantification was done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins that are part of seven complexes, each fused to 2xL/3xL/4xL-DHFR F[1,2]. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or because they interact with one of them according to data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published in (84) to choose 95 central and associated protein members of the following complexes: the RNApol I, II, and III, the proteasome, and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17), and associated proteins (n=9; 7 tested); proteasome (n=47; 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 of the 95 proteins initially selected are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above.

Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL, and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
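The block layout described above amounts to enumerating every bait-prey linker combination for one protein pair. A minimal sketch, using hypothetical protein names and a helper `linker_block` introduced only for this illustration:

```python
from itertools import product

LINKERS = ("2xL", "4xL")


def linker_block(bait, prey):
    """Return the four bait/prey linker combinations tested in one block.

    `bait` and `prey` are protein names; the names used below are
    placeholders, not strains from this study.
    """
    return [
        (f"{bait}-{bait_linker}", f"{prey}-{prey_linker}")
        for bait_linker, prey_linker in product(LINKERS, repeat=2)
    ]


for bait_fusion, prey_fusion in linker_block("BaitX", "PreyY"):
    print(bait_fusion, "x", prey_fusion)
```

Keeping the four combinations adjacent on the array means they experience similar local growth conditions, which is what later allows the interaction scores within a block to be compared directly.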

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed into 384-format arrays and finally into 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey plates were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Next, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB, with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the difference in signal between linkers was increased, null, or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments is shown in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed, and 6 µL of each dilution was spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted, and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and applied a manual threshold across plates. We selected colonies for measurement with a circular selection, using particle detection with the built-in ImageJ64 function "Analyze Particles". We excluded particles touching the edge of the selection and those with an area below 20 pixels and a circularity below 0.5, and used the particle closest to the center. We considered a particle to be a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained, and the median colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered significant, detected interactions. These Zs were used to compare the same interaction across linker size combinations. We considered a change significant when the Zs differed by more than 2.
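The scoring scheme can be sketched in a few lines. The study used the R function normalmixEM from mixtools; the pure-Python EM routine below is a simplified stand-in written for this illustration, and the simulated scores, the function names, and the initialization strategy are all assumptions.

```python
import math
import random


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_background(scores, n_iter=200):
    """Fit a two-component normal mixture by EM.

    Returns (b, sdb), the mean and standard deviation of the component
    with the lower mean, taken here as the background distribution.
    """
    n = len(scores)
    means = [min(scores), max(scores)]          # simple, assumed initialization
    grand_mean = sum(scores) / n
    spread = math.sqrt(sum((s - grand_mean) ** 2 for s in scores) / n) or 1.0
    sds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: probability that each score belongs to component 0.
        resp0 = []
        for s in scores:
            p0 = weights[0] * normal_pdf(s, means[0], sds[0])
            p1 = weights[1] * normal_pdf(s, means[1], sds[1])
            total = p0 + p1
            resp0.append(p0 / total if total > 0 else 0.5)
        # M-step: update weight, mean, and standard deviation per component.
        for k in (0, 1):
            resp = resp0 if k == 0 else [1.0 - r for r in resp0]
            nk = sum(resp)
            if nk < 1e-9:
                continue
            means[k] = sum(r * s for r, s in zip(resp, scores)) / nk
            var = sum(r * (s - means[k]) ** 2 for r, s in zip(resp, scores)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)
            weights[k] = nk / n
    k_bg = 0 if means[0] <= means[1] else 1
    return means[k_bg], sds[k_bg]


def z_score(interaction_score, b, sdb):
    """Zs = (Is - b) / sdb."""
    return (interaction_score - b) / sdb


# Simulated interaction scores: a large background and a smaller signal group.
rng = random.Random(0)
scores = [rng.gauss(0.0, 1.0) for _ in range(900)]
scores += [rng.gauss(8.0, 1.0) for _ in range(100)]

b, sdb = fit_background(scores)
detected = [s for s in scores if z_score(s, b, sdb) > 2.5]
print(f"b={b:.2f} sdb={sdb:.2f} detected={len(detected)}")
```

Estimating b and sdb from the background component, rather than from all scores, keeps true interactions from inflating the standard deviation and masking weaker signals.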

For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained, and the median colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) gave incoherent results for all tested interactions, leaving a total of 228 (197 unique) pairs of interacting proteins.
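The outlier rule and the median-based interaction score can be illustrated as follows. The replicate values are made up, and the quartile convention (Python's default "exclusive" method in `statistics.quantiles`) is an assumption, since the text does not specify how quartiles were computed.

```python
import statistics


def remove_extreme_outliers(values, k=3.0):
    """Drop values below Q1 - k*(Q3 - Q1) or above Q3 + k*(Q3 - Q1)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical replicate colony sizes for one interaction.
replicates = [1, 2, 3, 4, 5, 6, 7, 8, 100]

kept = remove_extreme_outliers(replicates)
interaction_score = statistics.median(kept)
print(kept, interaction_score)
```

Using three times the interquartile range rather than the more common 1.5 removes only extreme values, so ordinary replicate-to-replicate variation is preserved.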

At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that all combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, and 4xL-4xL) were tested for a specific interaction within a block of four adjacent strains, the Is for the different linker size combinations could be compared directly. The difference from the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL, and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, where the interaction was detected with the reference linker size (2xL-2xL) and with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, where the interaction was detected with the reference linker size (2xL-2xL) and showed a significant change for at least one longer linker combination (difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
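The three categories can be expressed as a small decision function. This is a sketch of the logic as described, not the analysis code of the study. The function name, the input layout, and the handling of cases the text leaves open (for example, a significant p-value with an absolute difference of 1 or less, treated here as unchanged) are assumptions.

```python
def classify_pair(detected_ref, detected_longer, changes,
                  diff_cutoff=1.0, alpha=0.05):
    """Classify one protein pair.

    detected_ref:    True if detected with the reference 2xL-2xL combination.
    detected_longer: True if detected with at least one longer combination.
    changes:         list of (difference_vs_reference, fdr_p_value), one per
                     longer combination (2xL-4xL, 4xL-2xL, 4xL-4xL).
    """
    if not detected_ref:
        return "new interaction" if detected_longer else "not detected"
    significant = any(
        abs(diff) > diff_cutoff and p_value < alpha
        for diff, p_value in changes
    )
    return "quantitative change" if significant else "unchanged"


# Hypothetical examples.
print(classify_pair(False, True, [(2.0, 0.001)]))
print(classify_pair(True, True, [(0.2, 0.40), (0.1, 0.60), (0.3, 0.30)]))
print(classify_pair(True, True, [(1.8, 0.01), (0.1, 0.60), (0.3, 0.30)]))
```

Requiring both an effect-size cutoff and a corrected p-value guards against calling tiny but statistically significant differences a change.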

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II, and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II, and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the USEARCH software. PDB files 4C3I, 4V1N, and 5FJA were selected as representative monomeric complexes for the RNApol I, II, and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base, and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid, and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid, and the outer rings of the core, and of 5CZ4 for the inner rings of the core. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of a protein were present within a complex, the mean of the minimal possible distances was used for the analyses.
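The surface-path distance can be illustrated with a toy graph. The analysis used NetworkX; the sketch below substitutes a small standard-library implementation of Dijkstra's algorithm so that it is self-contained. The node names, coordinates, and cutoff are hypothetical and chosen only to show why a path along the surface can be longer than the straight-line distance.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`; weight = distance.

    coords maps a node name to an (x, y, z) tuple, standing in for the
    C-alpha coordinates of surface residues.
    """
    graph = {name: [] for name in coords}
    names = list(coords)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Weighted shortest path length (Dijkstra); math.inf if unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf


# Toy example: three surface nodes forming a right angle.
coords = {"A": (0.0, 0.0, 0.0), "B": (10.0, 0.0, 0.0), "C": (10.0, 10.0, 0.0)}
graph = build_graph(coords, cutoff=12.0)
print(shortest_path_length(graph, "A", "C"))
```

With a cutoff of 12, nodes A and C are not directly connected, so the path runs through B and is longer than the straight-line distance between A and C. This mirrors the intent of measuring distances along the surface of the complex rather than through its interior.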

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n = 45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These preys include proteins known to interact with the baits,
proteins that are within the same complexes as the baits, and random proteins used as
controls, for a total of 26,640 potential interactions tested in four replicates (Table S1B).
We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL,
respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise
ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than
two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven
increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may
represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out
of nine increased interactions were reported by affinity-capture mass spectrometry (18) but
not by PCA with standard linkers, suggesting that longer linkers may allow for the detection
of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA
signal represent cases between baits and preys within the same complexes, suggesting that
there is no decrease in specificity with the elongated linkers. Finally, for the cases where
proteins were not in the same complex or were not previously shown to interact, it is likely
that they represent actual interactions previously undetected in living cells. For example,
many genetic interactions and physical interactions (in vitro and in vivo) have been described
between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some of these
interactions in living cells (such as between Arc18 and Pup1), often with an increased signal
with the 4xL compared to the 2xL (Table S1B). Together, these results show that DHFR PCA with
increased linker size reveals new interactions and could be an improved tool to study
inter-complex associations.
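The size of the screen and the calling of positive PPIs can be sketched as follows. This is only an illustration under stated assumptions: the z-score is computed here as the deviation of a colony size from the mean of a background sample, in units of its standard deviation, and the cutoff is taken as 2.5; the exact normalization used in the study is not described in this section. The colony sizes, the second protein pair and the function names are hypothetical.

```python
from statistics import mean, stdev

# 15 proteins, each fused with three linker lengths (2xL, 3xL, 4xL), give 45 baits.
N_BAITS = 15 * 3
N_PREYS = 592
N_TESTS = N_BAITS * N_PREYS  # number of bait-prey combinations in the screen

def z_scores(colony_sizes, background):
    """Express each colony size as a deviation from the background distribution."""
    mu, sigma = mean(background), stdev(background)
    return {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}

def call_ppis(scores, threshold=2.5):
    """Keep the pairs whose z-score exceeds the (assumed) cutoff."""
    return sorted(pair for pair, z in scores.items() if z > threshold)

# Toy values (hypothetical colony sizes, arbitrary units).
background = [100, 110, 90, 105, 95]
sizes = {("Arc18", "Pup1"): 130, ("BaitX", "PreyY"): 104}
print(N_TESTS, call_ppis(z_scores(sizes, background)))
```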

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in
protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL,
4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the
RNApol I, II and III and COG complexes were also performed. Among the 10,192 unique tested
PPIs, 755 interactions were considered true PPIs after filtering (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs (197 unique, with reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI).
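The distinction between protein pairs and unique pairs comes from collapsing the two bait-prey orientations of the same interaction. A minimal sketch of that bookkeeping, using protein names mentioned in this chapter but a hypothetical list of detected pairs, could be:

```python
def unique_pairs(detected):
    """Collapse bait-prey orientations: (X, Y) and (Y, X) count as a single PPI."""
    return {frozenset(pair) for pair in detected}

# Hypothetical detections, written as (bait fused to DHFR F[1,2], prey fused to DHFR F[3]).
detected = [("Rpo21", "Rpb2"), ("Rpb2", "Rpo21"), ("Rpo21", "Rpb3")]
print(len(detected), len(unique_pairs(detected)))
```

Using an unordered set as the key is a design choice that makes the reciprocal orientations indistinguishable by construction.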

As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were
correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of
interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength
was observed when using the 4xL compared to the 2xL, reinforcing the observation that no
overall decrease in specificity is seen with the elongated linkers. However, the increased
linker length had an obvious impact on 93 (83 unique) interacting pairs (Fig. 1B). PCA signal
was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs
were detected using at least one 4xL. Thus, doubling the linker length can substantially
widen the repertoire of detected interactions for a complex.
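The counts reported in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below only restates the numbers given in the text; the dictionary layout and the function name `summarize` are illustrative choices.

```python
# Counts of interacting pairs reported in the text (all pairs and unique pairs).
unchanged = {"all": 135, "unique": 114}
changed = {"all": 19, "unique": 18}
new = {"all": 74, "unique": 65}

def summarize(key):
    """Return (total pairs, pairs affected by the 4xL, percentage of unchanged pairs)."""
    affected = changed[key] + new[key]
    total = unchanged[key] + affected
    return total, affected, round(100 * unchanged[key] / total, 1)

for key in ("all", "unique"):
    print(key, summarize(key))
```

The totals recover the 228 pairs (197 unique) and the 93 (83 unique) affected pairs, with just under 60% of pairs unchanged in both counts.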

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect
new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared
with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare
cases, increasing linker length had the opposite effect, leading to PPI loss or signal
reduction. Rpo21 was particularly affected. This protein, one of the two largest components
of RNApol II, contributes to five out of the nine quantitatively decreased interactions.
Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to
lose all of the others. This consequence may thus arise from steric effects rather than from
the destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to the
three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase
(Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly
chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In
the COG complex, new interactions were seen between Cog1, from the core subunit, and proteins
from lobe A or lobe B (Fig. 1D, right panel). All these results show that doubling the linker
length of central proteins in complexes expands the network of interactions detected by DHFR
PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, the PCA signal obtained with longer linkers
allowed better discrimination between the different subunits of large complexes. This is
particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs
are detected when the two proteins are in the same subcomplex (such as base-base, core-core
and lid-lid) regardless of linker length, though the fraction is systematically higher with
longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E,
left and right panels). Structural biology in living cells could thus gain from PPI data
obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of
proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for
distance, we measured the shortest path between the C-termini of the proteins of interest
(Table S2A). We find that interaction z-scores often reflect the distance between proteins
(Fig. 2A). For the proteasome, the complex for which we have the most distance values, a
negative correlation is observed between the pairwise distance and the interaction z-score of
PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer
linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect
interactions at longer distances with longer linkers is clearly visible from the cumulative
distribution of z-scores as a function of pairwise distances, where positive z-scores
accumulate up to a longer distance for the 4xL-4xL combination than for the other
combinations (Fig. 2B, right panel). The density distribution of distances within complexes
is also slightly shifted towards larger distances for longer linkers, showing that longer
distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the
distance between proteins is significantly longer for cases where a longer linker increases
signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again
that longer linkers enhance the ability to detect interactions, especially for proteins that
are more distant in space.
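The rank correlation between pairwise distance and z-score can be sketched as below. This is a plain-Python illustration of the Spearman coefficient (the Pearson correlation of the ranks, with tied values sharing an average rank), not the analysis code of the study; the distance and z-score values are invented for the example and the function names are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def spearman(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented example: z-scores tend to fall as the C-terminus distance (Å) grows.
distances = [20, 50, 80, 120, 160]
zscores = [9.1, 6.0, 6.5, 3.2, 1.0]
rho = spearman(distances, zscores)
print(round(rho, 2))
```

A negative coefficient, as in this toy example, corresponds to the trend described for the proteasome, where closer C-termini tend to give higher z-scores.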

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly
and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and
measure protein proximity in living cells and among endogenously expressed proteins. Here, we
show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used as
preys, requiring only the construction of baits with different linker sizes. PCA is therefore
an addition to the other methods available to obtain low-resolution structural information
on the subunits of complexes, which include chemical cross-linking of protein complexes
(100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian
cells (68). Despite major advances in these other technologies in recent years, PCA will
remain the simplest assay because it requires minimal infrastructure investment and can be
adapted for high-throughput screening, which is still difficult to achieve with other
approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research grants 299432 and
324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology.
AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA
Scholarship. The authors thank the members of the Landry laboratory for feedback on the
manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization of
protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering reciprocal interactions such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green
lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored
boxes represent proteins that were absent from the experiment. (E) Proportion of detected
PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located
at different distances or in different subunits are highlighted on each structure. Distances
between the C-termini of these selected proteins and the associated PPI z-scores for these
newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and
proteasome complexes according to the distance between the C-termini, for interactions that
are not affected by longer linkers and those that increase in signal or that are newly
detected. p-values of Wilcoxon tests are shown.


Table S1A. Description of the strains constructed and used for this study.

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment.

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment.

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study.

Table S1D is too lengthy to be included in this document but can be obtained upon request.


Table S2A. Distances between C-termini calculated from molecular modeling.

Table S2A is too lengthy to be included in this document but can be obtained upon request.


Table S2B. Identity between each RNApol structure and the experimental sequences

Reference      Yeast proteins  Complex     Identity (%)
4C2M chain 1   Rpc10           RNApol I    100
4C2M chain 2   Rpa34           RNApol I    92.4
4C2M chain 3   Rpa49           RNApol I    94.4
4C2M chain 4   Rpa43           RNApol I    100
4C2M chain 5   Rpa190          RNApol I    89.7
4C2M chain 6   Rpc40           RNApol I    100
4C2M chain 7   Rpa135          RNApol I    97.2
4C2M chain 8   Rpb5            RNApol I    100
4C2M chain 9   Rpa14           RNApol I    59.6
4C2M chain 10  Rpa43           RNApol I    81.4
4C2M chain 11  Rpo26           RNApol I    100
4C2M chain 12  Rpa12           RNApol I    100
4C2M chain 13  Rpb8            RNApol I    88.2
4C2M chain 14  Rpc19           RNApol I    100
4C2M chain 15  Rpb10           RNApol I    100
4C2M chain 16  Rpa49           RNApol I    100
4C2M chain 17  Rpc10           RNApol I    100
4C2M chain 18  Rpa43           RNApol I    100
4C2M chain 19  Rpa34           RNApol I    92.4
4C2M chain 20  Rpa135          RNApol I    96.2
4C2M chain 21  Rpa190          RNApol I    88.5
4C2M chain 22  Rpa14           RNApol I    55.1
4C2M chain 23  Rpc40           RNApol I    100
4C2M chain 24  Rpo26           RNApol I    100
4C2M chain 25  Rpb5            RNApol I    100
4C2M chain 26  Rpb8            RNApol I    88.2
4C2M chain 27  Rpa43           RNApol I    80.2
4C2M chain 28  Rpb10           RNApol I    100
4C2M chain 29  Rpa12           RNApol I    96
4C2M chain 30  Rpc19           RNApol I    100
4C3I chain A   Rpa190          RNApol I    89.2
4C3I chain C   Rpc40           RNApol I    99.3
4C3I chain B   Rpa135          RNApol I    98.2
4C3I chain E   Rpb5            RNApol I    100
4C3I chain D   Rpa14           RNApol I    55.1
4C3I chain G   Rpa43           RNApol I    78.3
4C3I chain F   Rpo26           RNApol I    100
4C3I chain I   Rpa12           RNApol I    100
4C3I chain H   Rpb8            RNApol I    84.7
4C3I chain K   Rpc19           RNApol I    100
4C3I chain J   Rpb10           RNApol I    100
4C3I chain M   Rpa49           RNApol I    97.2
4C3I chain L   Rpc10           RNApol I    100
4C3I chain N   Rpa34           RNApol I    88
4V1N chain A   Rpo21           RNApol II   97.9
4V1N chain C   Rpb3            RNApol II   100
4V1N chain B   Rpb2            RNApol II   93.6
4V1N chain E   Rpb5            RNApol II   100
4V1N chain D   Rpb4            RNApol II   80.8
4V1N chain G   Rpb7            RNApol II   100
4V1N chain F   Rpo26           RNApol II   100
4V1N chain I   Rpb9            RNApol II   100
4V1N chain H   Rpb8            RNApol II   91
4V1N chain K   Rpb11           RNApol II   100
4V1N chain J   Rpb10           RNApol II   100
4V1N chain L   Rpc10           RNApol II   100
4V1N chain R   Tfg2            RNApol II   60.3
5FJA chain A   Rpo31           RNApol III  96.2
5FJA chain C   Rpc40           RNApol III  100
5FJA chain B   Ret1            RNApol III  100
5FJA chain E   Rpb5            RNApol III  100
5FJA chain D   Rpc17           RNApol III  73.9
5FJA chain G   Rpc25           RNApol III  85.8
5FJA chain F   Rpo26           RNApol III  100
5FJA chain I   Rpc11           RNApol III  82.7
5FJA chain H   Rpb8            RNApol III  94.5
5FJA chain K   Rpc19           RNApol III  100
5FJA chain J   Rpb10           RNApol III  100
5FJA chain M   Rpc37           RNApol III  84.9
5FJA chain L   Rpc10           RNApol III  100
5FJA chain O   Rpc82           RNApol III  84.3
5FJA chain N   Rpc53           RNApol III  73.8
5FJA chain Q   Rpc31           RNApol III  100
5FJA chain P   Rpc34           RNApol III  57.2


Table S2C. Identity between the proteasome structures and the experimental sequences

Reference Yeast proteins Complex Identity (%)

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 97.1

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 97.1

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 82.9

5A5B-centered chain E Pre2 Proteasome 99.5

5A5B-centered chain EA Rpn11 Proteasome 89.5

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 85.9


5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 93.1

5A5B-centered chain W Rpn2 Proteasome 90.9

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 93.1

Constructed proteasome chain 22 Rpn2 Proteasome 90.9

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 82.9

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 89.5

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 85.9

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100


Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100


Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol
I, II and III and proteasome structures

Yeast proteins Complex Reference Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. The Act1 protein was used as a loading control. (B) Distribution of PPI signal
(colony size) obtained in the global PCA (top left) and in the intra-complex experiments
(proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs
with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and
have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from
reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the
left because proteins are closer to each other when the interactions are detected. The
4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to
detect interactions between proteins that are further apart in space. (E) Repetition of the
standard DHFR PCA for selected results from the global PCA experiment, showing strong
reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assays of selected results
from the intra-complex experiment. Examples for each category of changes are shown. Cell
growth in the spot-dilution assay (right) correlates with colony size in the standard PCA
(left).


Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the
5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B
structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between
the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom)
Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example
of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green:
Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted
shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for
distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots
surface. Green spheres: surface residues on the proteasome.
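A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm over the surface residues. The sketch below is an illustration only: it assumes that two residues are connected when they lie within a maximum step length (`max_edge`, a hypothetical parameter) and that each edge is weighted by the Euclidean distance between them; the rule actually used to connect residues is not described in this section, and the node names and coordinates are invented.

```python
import heapq
import math

def shortest_path_length(coords, start, goal, max_edge):
    """Length of the shortest path from `start` to `goal` through the given nodes.

    coords: dict mapping a node id to an (x, y, z) tuple.
    Two nodes are linked if they are at most `max_edge` apart (assumed rule);
    the weight of a link is the Euclidean distance between the nodes.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            return d
        if d > best.get(node, math.inf):
            continue  # outdated queue entry
        for other, xyz in coords.items():
            if other == node:
                continue
            step = math.dist(coords[node], xyz)
            if step <= max_edge and d + step < best.get(other, math.inf):
                best[other] = d + step
                heapq.heappush(heap, (d + step, other))
    return math.inf  # goal not reachable with the chosen step length

# Invented example: the direct segment (12 Å) exceeds the step limit,
# so the path has to go through the intermediate surface residue.
coords = {"Cter_1": (0.0, 0.0, 0.0), "surf": (6.0, 8.0, 0.0), "Cter_2": (12.0, 0.0, 0.0)}
print(shortest_path_length(coords, "Cter_1", "Cter_2", max_edge=10.0))
```

Restricting the nodes to surface residues forces the path to follow the outside of the complex rather than cutting through it, which is the purpose of the surface selection shown in panel (D).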


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid
method refers to a method that detects associations between proteins that are close in space
without necessarily interacting physically. Such a method would allow the architecture of
protein complexes to be examined in greater depth and dissected more finely. Concretely, the
approach was to modify the length of the linkers used in DHFR PCA in S. cerevisiae. To
validate the method, it was first necessary to verify whether increasing the linker length
changed the interactions detected. It was also relevant to test the application of the method
to the study of protein complexes using several combinations of linkers of different lengths.
Finally, the validation of the method could be completed by comparing the results obtained
with the distances measured from the available protein structures of the proteasome.

The results of the first validation show that by changing a single parameter, namely doubling
the length of one linker, the signal-to-noise ratio increased significantly, allowing better
identification of associations. Seven new associations were observed within protein complexes
and between different complexes, notably between the proteasome and the actin cytoskeleton.
The nature of the associations detected suggests that the specificity of DHFR PCA is
maintained despite the change in linker length. The in-depth study of the five protein
complexes shows that this variation of DHFR PCA detects new interactions while preserving the
specificity of the method. Indeed, among all the unique interactions detected, more than 30%
were new. One could therefore expect to obtain almost as many new interactions if this
variation of PCA were applied to protein complexes that have already been studied. This
percentage could vary with the number of combinations of linkers of different lengths used.
For example, it could be reduced by using a single combination, since some protein-protein
associations were detectable only with one specific combination of linkers. Using an extended
linker for the DHFR F[1,2] fragment appears to be sufficient to detect the majority of the
new PPIs and of those whose signal increases.


The rare cases in which the signal decreased as linker length increased are more likely
caused by steric effects than by a destabilization of the proteins involved. These cases can
nevertheless provide structural information, notably by identifying the strongest
associations within the complex. Moreover, the use of extended linkers informs on the
organization of protein complexes, particularly when it involves central proteins. Finally,
the associations detected reflect well the organization of protein complexes into
subcomplexes. By comparing the distances between proteins in the proteasome structures with
the PCA results obtained, it is possible to confirm that increasing the linker length does
allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA is a notable advance in the study of protein-protein associations. Simply by doubling the length of the linker on the DHFR F[12] fragment, it is possible to increase the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the linkers of other lengths, which would provide a validation and a point of comparison, and would help detect problems that arose during the construction of the proteins. For example, it becomes easier to spot faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. This control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale with a robotic platform. Moreover, the DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium, so it is easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features provide certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that gives several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also have to be determined, so as to maximize the new information gained while taking into account the additional complexity that comes with each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[12] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas the resolution should be coarser for large protein complexes.
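The idea of a graded series of linker lengths can be sketched in a few lines of Python. The thesis names its linkers 2xL, 3xL and 4xL after the number of Gly-Gly-Gly-Gly-Ser repeats they contain; the sketch below simply extends that naming to any number of repeats. The function names and the figure of about 3.5 Å per residue for a fully extended chain are assumptions made for illustration, not values taken from this work, and a flexible linker in a cell will usually span much less than this upper bound.

```python
# Illustrative sketch only: builds (Gly-Gly-Gly-Gly-Ser)n linkers and gives a
# rough upper bound on their reach. The per-residue length is an assumption.

LINKER_MOTIF = "GGGGS"        # one Gly-Gly-Gly-Gly-Ser repeat, one-letter code
ANGSTROM_PER_RESIDUE = 3.5    # assumed length of a residue in an extended chain


def linker_sequence(repeats):
    """Return the amino-acid sequence of a linker with `repeats` motif copies."""
    if repeats < 1:
        raise ValueError("a linker needs at least one repeat")
    return LINKER_MOTIF * repeats


def max_extended_length(repeats):
    """Upper bound, in angstroms, on the end-to-end length of the linker."""
    return len(linker_sequence(repeats)) * ANGSTROM_PER_RESIDUE


def linker_series(max_repeats):
    """List (name, sequence, upper-bound length) for 1..max_repeats repeats."""
    return [
        (f"{n}xL", linker_sequence(n), max_extended_length(n))
        for n in range(1, max_repeats + 1)
    ]


if __name__ == "__main__":
    for name, seq, length in linker_series(6):
        print(f"{name}: {seq} ({len(seq)} residues, at most {length:.1f} angstroms)")
```

Listing the series this way makes the trade-off discussed above concrete: each added repeat extends the maximum reach by a fixed step, so the choice of increment directly sets how many collections would have to be built to cover a given range of distances.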

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution made possible by lengthening the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.



List of abbreviations

%         Percentage
°C        Degree Celsius
Å         Ångström
DNA       Deoxyribonucleic acid
Amp       Ampicillin
mRNA      Messenger ribonucleic acid
BioID     Proximity-dependent biotinylation
ClonNAT   Nourseothricin
COG       Conserved oligomeric Golgi
DHFR      Dihydrofolate reductase
DMSO      Dimethyl sulfoxide
F[12]     Fragment 12 of DHFR
F[3]      Fragment 3 of DHFR
FDR       Corrected P-value
FRET      Energy transfer between fluorescent molecules
g         Gram
Gly or G  Glycine
h         Hour
HygB      Hygromycin B
Is        Interaction score
L         Liter
Log       Logarithm
M         Molar
Min       Minute
mL        Milliliter
mM        Millimolar
MS        Mass spectrometry
MS/MS     Tandem mass spectrometry
MTX       Methotrexate
MYTH      Membrane yeast two-hybrid


NaCl      Sodium chloride
NMR       Nuclear magnetic resonance
OD        Optical density
PBS       Phosphate-buffered saline
PCA       Protein-fragment complementation
PCR       Polymerase chain reaction
PKA       Protein kinase A
PPI       Protein-protein interaction
Q1        Quartile 1
Q3        Quartile 3
r         Correlation coefficient
RNApol    RNA polymerase
Sdb       Standard deviation
Ser or S  Serine
SDS       Sodium dodecyl sulfate
SDS-PAGE  Sodium dodecyl sulfate polyacrylamide gel electrophoresis
t-test    Student's t-test
YPD       Yeast extract peptone dextrose
Y2H       Yeast two-hybrid
Zs        Z-score
µb        Estimated mean
µg        Microgram
µL        Microliter
µM        Micromolar
2YT       2x yeast extract tryptone
2xL       Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif
3xL       Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif
4xL       Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif


Acknowledgments

Completing this project required the help of several people whom I sincerely wish to thank. First of all, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey, Christian encouraged me to give the best of myself, both scientifically and as a member of the group. He not only gave me the material means to do so, he also showed me that I had the ability to do so. Christian is a very present supervisor who is available for his students. He offered me opportunities and supported me in each of them.

I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr. Nicolas Bisson, for their advice and the time they devoted to this project.

I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the two research professionals of the laboratory. Their great expertise and their passion for science are a pillar of this team. Without their valuable advice, their dedication and their availability, carrying out this project would have been particularly arduous. I also wish to thank my collaborators, Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is all the better. Special thanks to Xavier for his help, his availability and the stimulating discussions.

I think it is important to thank all the members of the Landry laboratory. Graduate studies require spending a lot of time in the laboratory, which becomes like a second home. Hence the importance of sharing laughs and building a rapport with its members. I would like to thank them all for the chats and laughs at the famous "tea breaks", the lively discussions and, of course, the support, both in the laboratory and morally. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. A very special thank you to Guillaume and Hélène, who were particularly good at putting a smile on my face, and at supporting and advising me in difficult times.


It is also important to thank my parents, as well as my whole family and my friends. My parents have always encouraged me to fulfill myself and to love my work. They not only provided me with an ideal environment in which to reach my goals throughout my studies, they also gave me their moral support and instilled in me the importance of always doing one's best. The values they passed on to me have given me a strong sense of responsibility, honesty and commitment. Thanks to my family and friends, I was able to unwind, simply have fun and pour my heart out from time to time. They were a source of moral support.

Finally, I want to thank, from the bottom of my heart, my partner, Marc Bélanger. Marc is an incredibly generous person: generous with his time, his listening, his knowledge and his passions. He was an invaluable support throughout this journey, at every moment. His encouragement, his shoulder, his tissues and his understanding eased my fears and my sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has brought me personally, humanly and professionally. Marc has made me a better person, and I will always be grateful to him. Thank you, my love, thank you for everything.


Foreword

This thesis comprises a single chapter written in the form of a scientific article that will be submitted for publication. The article presents an adaptation of the PCA method that detects associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to the planning of the experiments with Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research professionals). Several people, myself included, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were carried out by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were done jointly by Isabelle Gagnon-Arsenault, Christian Landry and myself.

During this project, I also contributed to the writing of a literature review published in Briefings in functional genomics in March 2016 under the title Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry. This article is not presented in this thesis.


General introduction

1.1 The fundamental nature of protein-protein interactions

Proteins, through their great diversity of roles, are considered the machinery of life. Their temporary or permanent associations are at the heart of signaling and regulatory pathways as well as protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since they are involved in all cellular processes as well as in the maintenance of cellular functions.

Interactions that form transiently are often found in signaling and regulatory processes. They require excellent spatiotemporal coordination, which explains why poor coordination can lead to diseases such as cancer (1). One example of a transient association is that of the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, aging and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).

In addition to transient interactions, the cell hosts stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs within a complex are stable, the complex itself may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). This association gives rise to additional biological activities that would be impossible for the proteins taken individually. A good illustration of this concept is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, which is conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).

The two preceding examples illustrate the crucial role of protein-protein associations. Nevertheless, they represent only a tiny fraction of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning and cellular viability of a given organism. The PPI network has been mapped at large scale for several organisms, notably human (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses (27-29). These maps give a static picture of the network and do not fully account for the cell's ability to adapt to different conditions (e.g., environment, cell cycle). To overcome this limitation, additional maps were subsequently produced that capture the dynamics of interaction networks by perturbing cell growth conditions. Among other things, they provide information on the adaptation or plasticity of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.

1.2 Practical applications of the study of protein-protein interactions

The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore in yeast and trypanosomes (30). These two organisms, which diverged more than 1.5 billion years ago, show both similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the nuclear membrane and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mRNA export mechanisms of the two organisms (30). In addition, perturbing PPIs makes it possible to assess the robustness of a protein complex to mutations, that is, the ability of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins of the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analyzing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of certain proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).

In medicine, the study of PPIs has been widely used to discover new drugs (32-34). Moreover, identifying structural differences in a protein complex between two organisms can provide useful targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which should eventually make it possible to treat the infections caused by these parasites (35). PPIs also help to understand the genetic basis of diseases, as shown by Sahni and colleagues. This team examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, the perturbation of interaction networks was responsible for the diseases studied, affecting the networks either partially or completely. Furthermore, different mutations in the same gene cause different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary because each has advantages and limitations that allow it to capture only a particular subset of the interaction network (37). Nevertheless, they can all be divided into two main categories: methods that determine the composition of protein complexes and methods that determine physical interactions between two proteins.

The first category includes methods that purify a protein complex, by either affinity or separation chromatography, and then analyze it by mass spectrometry (MS). The second category covers a wide variety of methods, including the yeast two-hybrid (Y2H), the membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). Methods in the second category rely on very similar principles and are based on the reconstitution of a functional reporter that emits a signal when the two proteins physically interact. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, the term "hybrid method" refers to methods that detect associations between proteins that are close in space without these necessarily being physical interactions. These methods therefore share features of both categories. In this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary because they define, on the one hand, the components of a protein complex and, on the other hand, the relationships among them.

1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purifying protein complexes and identifying their components by MS aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract using an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the nonspecific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, which yields a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins present in the complex (38-40). Tandem mass spectrometry (MS/MS) is also possible. From a first MS scan, a peptide is selected and fragmented, and a new spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main benefit of this purification is that it can isolate all the protein complexes of an organism for further study (43).

1.3.2 Methods determining the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment complementation assay

Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments attached to the two proteins of interest through a linker. When the two proteins of interest physically interact, the two reporter fragments assemble and reconstitute a functional reporter that produces a detectable signal.

In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that enabled the development of several others. However, the technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variants of the method have nevertheless been developed to allow the study of membrane proteins (45-47); these are the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, shows background noise and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as human, to be studied in a simpler model (51).

In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest physically interact, the transcription factor attached to the reconstituted ubiquitin is released, which activates the transcription of a reporter gene. Split-ubiquitin-based methods have enabled major advances in the study of membrane proteins, insoluble proteins and proteins located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the inability to quantify results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of a transcription factor it uses as its reporter a protein that has been split into two fragments. The choice of reporter and of cleavage site were key elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to associate spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved, among other things, in the synthesis of certain DNA bases (purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. The technique has the advantage of being quantitative and of preserving the native promoter of the proteins studied (48, 55, 56). In addition, PCA results suggest that the cellular localization of proteins is preserved: there is a Gene Ontology enrichment for several known proteins sharing the same cellular localization (55). However, a change in localization cannot be ruled out, because the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. Adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. The composition or length of the linker may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers and linkers cleavable in vivo. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are especially useful when a flexible linker is insufficient to separate the two elements or interferes with the activity of the protein. Linkers cleavable in vivo allow the reporter fragment to be released under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on energy transfer between two fluorescent proteins that are close to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Exciting the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are close to each other. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and the partial overlap of the fluorescence of the two proteins can hamper interpretation of the results (60-63).

Cross-linking followed by MS is virtually identical to the purification and MS techniques, except that the proteins are covalently linked to one another before purification. These links withstand enzymatic digestion and thus provide structural information on how the proteins associate within the complex. However, cross-linking complicates data analysis and may lead to an incorrect picture of the architecture of the protein complex. The method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity that is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to interactions between neighboring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can impair the activity of the protein of interest or the formation of interactions. In addition, the method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They provide information on protein proximity, giving access to a new scale of molecular resolution that is otherwise difficult to reach. Besides being complex, existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply at large scale. Developing simpler, higher-throughput hybrid methods would help to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. They would also compensate for the shortcomings of high-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to many protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment made of two repeats of the following motif: four glycines and one serine (GGGGS). It provides good flexibility and allows proper association of the reporter fragments in the cellular environment. Indeed, glycine and serine are both small amino acids, the former nonpolar and the latter polar. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also constrains the ability to detect an interaction, as observed notably by the research team that developed PCA at large scale (55). By studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. Furthermore, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is therefore possible to detect new interactions and, in doing so, to obtain new structural information.
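
The relationship between linker length and the distance that can be bridged can be illustrated with a rough calculation. The sketch below is only an illustration under stated assumptions: it treats the linkers as fully extended and derives a per-residue length from the figures quoted above (two linkers of two GGGGS repeats each, 20 residues in total, spanning 82 Å, hence 4.1 Å per residue). The function names and the longer linker sizes are hypothetical and are not taken from the studies cited.

```python
# Rough upper bound on the distance between two C-termini that a pair of
# fully extended linkers could bridge.
# Assumption: 82 Å for two 10-residue linkers, as quoted in the text,
# which gives 4.1 Å per residue. This is a simplification, not a measurement.

ANGSTROM_PER_RESIDUE = 82.0 / (2 * 10)
MOTIF = "GGGGS"  # linker motif described in the text


def linker_sequence(n_repeats: int) -> str:
    """Return the linker sequence made of n_repeats copies of the motif."""
    return MOTIF * n_repeats


def max_span(n_repeats_a: int, n_repeats_b: int) -> float:
    """Combined extended length (in Å) of the two linkers placed end to end."""
    residues = len(linker_sequence(n_repeats_a)) + len(linker_sequence(n_repeats_b))
    return residues * ANGSTROM_PER_RESIDUE


if __name__ == "__main__":
    # (2, 2) is the standard pair described in the text; the others are hypothetical.
    for repeats_a, repeats_b in [(2, 2), (2, 4), (4, 4)]:
        span = max_span(repeats_a, repeats_b)
        print(f"{repeats_a} + {repeats_b} repeats: up to about {span:.0f} Å")
```

Under these assumptions the reach grows linearly with the number of repeats. Real linkers are flexible and rarely fully extended, so the values are upper bounds at best.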

1.6 Research objectives

The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length of the DHFR PCA would allow the detection of interactions between proteins that are increasingly distant in space, thereby modulating the scale of molecular resolution. This adaptation would then yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations between 15 proteins from seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The final objective was to test whether increasing linker length allowed the detection of associations between proteins that are farther apart in space. To do so, distances between the proteins in the proteasome structures were calculated and compared with the experimental results.
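
The comparison described in the last objective can be sketched in a few lines. The example below is a minimal illustration, not the analysis used in this work: the protein names and coordinates are invented, the use of a single point per protein (for instance the C-terminus) is an assumption, and the 82 Å cutoff is simply the value quoted earlier for the standard linkers.

```python
import math
from itertools import combinations

# Hypothetical coordinates (in Å) standing in for one reference point per
# protein, for example the C-terminus taken from a solved structure.
HYPOTHETICAL_COORDS = {
    "P1": (0.0, 0.0, 0.0),
    "P2": (30.0, 40.0, 0.0),
    "P3": (0.0, 0.0, 120.0),
}


def pairwise_distances(coords):
    """Euclidean distance (Å) for every unordered pair of proteins."""
    return {
        (a, b): math.dist(coords[a], coords[b])
        for a, b in combinations(sorted(coords), 2)
    }


def within_reach(distances, reach):
    """Pairs whose distance does not exceed the assumed linker reach."""
    return {pair for pair, d in distances.items() if d <= reach}


if __name__ == "__main__":
    distances = pairwise_distances(HYPOTHETICAL_COORDS)
    expected = within_reach(distances, reach=82.0)
    for pair, d in sorted(distances.items()):
        status = "within reach" if pair in expected else "out of reach"
        print(pair, f"{d:.0f} Å", status)
```

The set of pairs predicted to be within reach could then be compared with the pairs that actually give a signal in the assay.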

This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive in several respects, notably the availability of numerous powerful genetic tools, its rapid cell division and the abundance of data on protein complex structures and PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).

Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the dihydrofolate reductase-based protein-fragment complementation assay (DHFR PCA) in yeast for obtaining low-resolution structural constraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein networks in vivo.

Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here we examine the potential of the yeast Protein-fragment Complementation Assay based on the dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are further apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.

Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership because detecting an association does not provide information on the spatial organization of the complex (84-86).
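
To see why a single purification yields many inferred interactions, consider a minimal sketch in which every pair of co-purified proteins is counted as an inferred interaction. This all-pairs expansion is an assumption made for illustration and is not necessarily the inference scheme used by the databases mentioned above; the protein names are hypothetical.

```python
from itertools import combinations


def all_pairs(co_purified):
    """Every unordered pair among the proteins recovered in one purification."""
    return sorted(combinations(sorted(set(co_purified)), 2))


if __name__ == "__main__":
    # Hypothetical purification: one bait and three co-purified proteins.
    purification = ["bait", "preyA", "preyB", "preyC"]
    pairs = all_pairs(purification)
    # n proteins give n * (n - 1) / 2 inferred pairs.
    print(len(pairs), "inferred pairs from", len(purification), "proteins")
    for pair in pairs:
        print(pair)
```

None of these inferred pairs carries information on which proteins actually touch each other, which is the limitation noted above.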

The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).

Despite this recent progress there is still a need for tools that can detect proximate

relationships among proteins in vivo which would complement and further enhance our

ability to infer the relationships among proteins within and between complexes or

subcomplexes Being able to infer such relationships at different levels of resolution in living

cells is key to future development in cell and systems biology because high-resolution

methods such as NMR or X-ray crystallography are not yet amenable to high-throughput

analysis and cannot be applied to all protein types PCA (87 89) may provide the

14

technological advantages required for such an approach by complementing methods

detecting co-complex membership and direct interactions

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein,
usually at their C-terminus. Upon interaction, the two fragments assemble into a functional
protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are
usually connected to the reporter fragments with a linker of ten amino acids. In principle, the
length of the linker limits the maximum distance between the proteins for an interaction to
be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was
shown that the distance constraint determined by linker length could affect the ability to detect
PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein
complexes for which the distance between C-termini of proteins could be measured, protein
interactions were 35 times more likely to be detected if the C-termini were within
82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing
the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric
membrane receptor (69). Together, these results suggest that linkers of variable sizes could
improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances
between proteins in living cells. Here, we test the effect of linker size on the ability to detect
PPIs by PCA in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast
Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆
met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were
grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for
solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL
hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA
experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino
acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without
adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).

Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were
grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and
2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as
templates to create new plasmids containing DHFR fragments fused to a linker of varying
size. Both original plasmids contained the sequence coding for two repetitions of the motif
Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for
the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in
plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm,
pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were
composed of synonymous codons leading to the same peptide sequence.

To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and
4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and
inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The
3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The
plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The
fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was
gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an
insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were
selected on 2YT+Amp. Finally, positive clones were confirmed by double
digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct
the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR
amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm,
respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR
F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid
pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment
corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The
remaining steps were performed as described above for the pAG25-3x/4xL-DHFR
F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3]
fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes.
2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for
DHFR F[3]) resistance modules (respectively for resistance to clonNAT and HygB), were
amplified by PCR from their respective plasmid with oligonucleotides specific to the gene to
fuse with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741
and BY4742 competent cells were transformed with the amplified modules following
standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged
strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing
confirmed proper DHFR fragment fusions for all strains.

Estimation of protein abundance

Protein abundance was quantified by Western blot for several strains expressing proteins
fused with the 2xL and 4xL. These proteins were selected because we could easily assess their
abundance using antibodies raised against them. 20 OD600 units of exponentially growing cells were
resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL
Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads (425-600 µm, Sigma)
were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific
Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants
were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were
separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE
gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device
(Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C,
membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p,
anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or
Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for
2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20,
membranes were then probed with IRDye® 680RD Goat anti-Rabbit IgG (1:10000),
IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG
(1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in
PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an
Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™
Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR
F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495
strains) were selected because they belong to the same
complexes as the baits or interact with one of them based on data reported
in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found
in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey
was present in four replicates, two on each prey plate, so each interaction was measured four
times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we performed a review of the literature and considered
the consensus protein complexes published by (84) to choose 95 central and associated
proteins that are members of the following complexes: RNApol I, II and III, the proteasome and
the COG complex. These complexes were selected because they vary in size (RNApol I
(n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested), proteasome (n=47, 44
tested) and COG complex (n=8)) and interactions among protein members of these
complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition,
there are published structures available for the RNApol and proteasome complexes, making
it possible to compare our results with known protein complex organization. We successfully
constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the
RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains
harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation
of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and
4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two
different prey plates of MATa cells were generated, including all strains mentioned above.
Baits and preys were positioned so that, in a block of four strains, all combinations of
linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and
4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and
COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly
positioned on the colony arrays. Finally, each 1536-array was designed to contain a double
border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid
any border effects on the growth of the colonies.
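The four-strain block design described above can be illustrated with a minimal sketch. It only enumerates the linker combinations for one bait-prey pair; the function name and the tuple layout are assumptions made for the example and do not describe the actual array-design scripts.

```python
from itertools import product

# Linker lengths used in the intra-complexes experiment (from the text).
LINKERS = ("2xL", "4xL")


def linker_block(bait, prey):
    """Return the four (bait construct, prey construct) pairs of one block.

    Hypothetical helper: the naming scheme "<protein>-<linker>" is only
    an illustration.
    """
    return [
        (f"{bait}-{bait_linker}", f"{prey}-{prey_linker}")
        for bait_linker, prey_linker in product(LINKERS, repeat=2)
    ]


block = linker_block("Rpb2", "Rpb3")
```

Each block therefore holds the 2xL-2xL reference next to the three longer-linker combinations, which is what later allows a paired comparison within a block.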

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa
cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and
incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a
384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot
(S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were
assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool.
Colonies were further condensed into 384-format arrays and finally into 1536-format arrays
using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of
1536-format were generated and replicated a few times to have enough cells to perform crosses
with all of the individual baits. Next, each 1536-bait plate was crossed with the two
1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds
of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of
two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and
incubated at 30 °C for four days, after which a second round of MTX selection was performed.
Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel
T3i camera (Canon) each day from the second round of diploid selection to the end of the
experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the
difference in signal was increased, null or decreased. The same procedure as described
above was used to assess the growth on MTX medium of selected diploid cells resulting from
a new cross between bait and prey strains. The correlation between the results of the two
experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed
results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay
(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions
to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were
performed and 6 µL of each dilution was spotted on MTX and DMSO DHFR PCA media.
Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel
T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the
integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We
applied an image correction where the intensity of each pixel was extracted and the pixel
intensity matrix was smoothed using a two-way median polish and averaged with the raw
image. We then converted the images to binary files and a manual threshold was applied
across plates. We selected colonies for measurement with a circular selection using particle
detection with the built-in function “Analyze Particles” in ImageJ64. We excluded particles
touching the edge of the selection and those that had an area below 20 pixels and a
circularity below 0.5, and used the particle closest to the center. We considered the
particle as being a colony if the mass center was within the mid-distance between two
colonies. All plate images were also examined. The average of the background pixels was
subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were
log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller
than 16 on the diploid selection plate were eliminated.
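As a minimal sketch of this preprocessing step, the snippet below applies the log2(x + 1) transform and the diploid-size filter. The input layout (a list of tuples) and the function name are assumptions made for illustration; the cutoff of 16 is the value given in the text.

```python
import math

MIN_DIPLOID_SIZE = 16  # size cutoff on the diploid selection plate (from the text)


def log2_scores(colonies):
    """Return log2(intensity + 1) for colonies that pass the diploid-size filter.

    `colonies` is a list of (mtx_intensity, diploid_size) tuples; this layout
    is an assumption of the example, not the format used by the authors.
    """
    return [
        math.log2(intensity + 1)  # +1 keeps colonies with zero intensity defined
        for intensity, diploid_size in colonies
        if diploid_size >= MIN_DIPLOID_SIZE
    ]


# Hypothetical values: the third colony is dropped because it barely grew
# on the diploid selection plate.
scores = log2_scores([(1023, 40), (0, 25), (500, 10)])
```

Filtering on the diploid plate rather than on the MTX plate removes positions where mating or selection failed, so that a missing colony is not mistaken for an absent interaction.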

For the global PCA experiment, interactions with at least two replicates for all linker
combinations were retained and the median of colony size was used as the interaction score
(Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of
interaction scores was modeled as a mixture of two normal distributions using the R package
mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard
deviation (sdb) of the background distribution were used to convert each interaction score into
a z-score (Zs = (Is − b) / sdb). Interactions with a Zs greater than 2.5 were considered
significantly detected interactions. These Zs were used to compare the same interaction with
different linker size combinations. We considered changes significant when Zs differed by
more than 2.
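The scoring rule can be sketched as follows, assuming the background mean b and standard deviation sdb have already been estimated from the mixture model (the fit itself was done in R and is not reproduced here). Function names and the numerical values are hypothetical.

```python
from statistics import median

Z_THRESHOLD = 2.5  # detection cutoff stated in the text


def interaction_score(replicates):
    """Interaction score Is: median colony size over replicates."""
    return median(replicates)


def z_score(score, b, sd_b):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background component."""
    return (score - b) / sd_b


def is_detected(replicates, b, sd_b):
    """True when the interaction exceeds the z-score detection cutoff."""
    return z_score(interaction_score(replicates), b, sd_b) > Z_THRESHOLD


# Hypothetical log2 colony sizes for one bait-prey pair and a made-up background.
replicates = [9.5, 10.0, 10.5, 9.0]
zs = z_score(interaction_score(replicates), b=8.0, sd_b=0.5)
detected = is_detected(replicates, b=8.0, sd_b=0.5)
```

Expressing every interaction relative to the background component of its own linker combination is what makes scores comparable across the 2xL, 3xL and 4xL screens.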

For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values
below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and
Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and
positioned on the array edges were removed from downstream analyses, as well as strains for
which sequencing results revealed mutations in the DHFR fusion proteins. After these final
filtering steps, interactions with at least four replicates for every linker combination were
retained and the median of colony size was used as the Is. Significant interactions were
identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated
mean (b) and standard deviation (sdb) of the background distribution were calculated for
each linker combination and each complex separately. For the COG complex, because the
number of pairwise interactions is limited to 64, all the results were combined to calculate
these parameters. An interaction was considered as being detected when the Zs was larger
than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker
combination, some pairs were filtered out, mainly because they did not pass all of the
thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented
incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs
of interacting proteins.
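The extreme-outlier rule used at the start of this filtering can be sketched with the standard library. The quartile convention is an assumption (the text does not state which one was used), and the function name and example values are hypothetical.

```python
from statistics import quantiles


def drop_extreme_outliers(values, k=3.0):
    """Keep values inside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)].

    k = 3 matches the rule in the text. The "inclusive" quartile method is
    an assumption of this sketch; other conventions shift the fences slightly.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical replicate colony sizes with one aberrant measurement.
kept = drop_extreme_outliers([10, 11, 12, 13, 14, 100])
```

A factor of 3 on the interquartile range is stricter than the usual 1.5, so only grossly aberrant colonies are discarded before the median is taken.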

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, because baits and preys were positioned so that, in a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations
could be compared directly. The difference with the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size
(2xL-2xL) and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05), and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1 with t-test
FDR p-value < 0.05) (Table S1C).
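The three-way classification can be sketched as below. Two points are assumptions of the example rather than statements from the text: the FDR correction is taken to be the Benjamini-Hochberg procedure, and a pair that is significant but whose difference stays within ±1 is labeled unchanged. The t-test itself is not reproduced; p-values are taken as given.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(detected_ref, detected_longer, diff, fdr_p, alpha=0.05, min_diff=1.0):
    """Classify one protein pair for one longer-linker combination.

    detected_ref / detected_longer: detection with 2xL-2xL / the longer linker.
    diff: Is(longer combination) - Is(2xL-2xL); fdr_p: adjusted paired t-test p.
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    if fdr_p < alpha and abs(diff) > min_diff:
        return "quantitative change"
    return "unchanged"


# Hypothetical raw p-values for four protein pairs.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

Combining an effect-size cutoff with the corrected p-value keeps small but statistically significant shifts from being reported as changes.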

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and
the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were taken from structure 5CZ4. Fig. S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.

22

The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is not complete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated as the weighted
shortest path using Dijkstra's algorithm as implemented in NetworkX (an example of the shortest
path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as
nodes to build the graph. Edges were placed between each pair of nodes
using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the node pair. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the “show dots”
and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
“wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome
structure were considered as surface residues (see Fig. S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal distances possible was used for the analyses.
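The surface-path distance can be illustrated with a small, self-contained sketch. The analysis itself relied on NetworkX; the version below is a plain standard-library re-implementation of the same idea (a distance-cutoff graph plus Dijkstra's algorithm), and the function names and coordinates are hypothetical.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes no farther apart than `cutoff` (in Å).

    Each node stands for the Cα atom of a surface residue; the edge weight
    is the Euclidean distance between the two atoms.
    """
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when no path exists."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf


# Four made-up surface atoms; nodes 0 and 3 play the role of two C-termini.
coords = [(0, 0, 0), (10, 0, 0), (20, 0, 0), (20, 10, 0)]
graph = build_graph(coords, cutoff=15.0)
path_length = shortest_path_length(graph, 0, 3)
```

Because the path is constrained to hop between surface atoms, the resulting distance follows the contour of the complex instead of cutting through its interior, which is closer to what a flexible linker can span.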

All PPI data related to the global PCA and intra-complexes experiments can be found in
Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer
linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of
protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability,
it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins that are within
the same complexes as the baits, and random proteins used as controls, for a total of
26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as
compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that
reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in size. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10192 unique
tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs after filtration (197 unique; reciprocal interactions
such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] count as only one
PPI).
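The distinction between all detected pairs and unique pairs can be made concrete with a short sketch. Collapsing reciprocal bait-prey orientations amounts to treating each pair as an unordered set; the function name and the example pairs are hypothetical.

```python
def count_unique_pairs(directed_pairs):
    """Count protein pairs, treating (X, Y) and (Y, X) as the same PPI."""
    return len({frozenset(pair) for pair in directed_pairs})


# Made-up detections: the first two entries are the two orientations
# (F[1,2] on X and F[3] on Y, then the reverse) of the same pair.
detected = [("Rpb2", "Rpb3"), ("Rpb3", "Rpb2"), ("Scl1", "Rpn5")]
n_all = len(detected)
n_unique = count_unique_pairs(detected)
```

Reporting both counts, as done above with 228 and 197, shows how many interactions were confirmed in both orientations.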

As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in the
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
observation that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than through the destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the
individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1 from the core
subunit and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and interaction z-score of PPIs for all lengths of
linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate to a longer distance for the
4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linker sizes (Fig. S1D).
Finally, we find that the distance among proteins is significantly longer for cases where longer
linker size increases signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that longer linker size enhances the ability to detect interactions,
especially for proteins that are more distant in space.

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to detect low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research (CIHR) grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) linker, compared to a 2xL linker. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circles: RNApol I, II and III. Orange squares: proteasome. Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker combination but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.
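The counting rule used in panel (C), where the two reciprocal bait-prey orientations are treated as one PPI, can be illustrated with a minimal sketch. The function name, the tuple layout and the z-score values below are hypothetical and are not taken from the study's analysis scripts; only the idea of merging X-Y with Y-X comes from the caption.

```python
from collections import defaultdict


def collapse_reciprocal(ppis):
    """Group bait-prey measurements so that X-Y and Y-X count as one PPI.

    `ppis` is an iterable of (bait, prey, z_score) tuples. The key is an
    unordered pair, so both orientations land in the same bucket.
    """
    merged = defaultdict(list)
    for bait, prey, z_score in ppis:
        merged[frozenset((bait, prey))].append(z_score)
    return dict(merged)


# Hypothetical measurements: (bait on DHFR F[1,2], prey on DHFR F[3], z-score)
measurements = [
    ("Pre8", "Pre9", 4.1),
    ("Pre9", "Pre8", 3.7),  # reciprocal orientation of the first pair
    ("Pre8", "Rpn5", 2.9),
]

unique_ppis = collapse_reciprocal(measurements)
print(len(unique_ppis), "unique PPIs from", len(measurements), "measurements")
```

Using an unordered key keeps both measurements of a pair available, so a downstream step could still require one or both orientations to be positive; which rule the authors applied is not stated in this caption.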


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
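As an illustration of the statistic reported in panel (B), the sketch below computes a Spearman rank correlation between C-terminal distances and PPI z-scores using only the Python standard library. The data values are invented for the example, and the helper names are hypothetical; the study's own analysis code and software are not shown in this document.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: distances between C-termini (angstroms) and PPI z-scores
distances = [20.0, 35.0, 50.0, 80.0, 110.0, 150.0]
z_scores = [9.5, 7.0, 7.5, 4.0, 3.0, 1.0]

rho = spearman(distances, z_scores)
print(round(rho, 3))  # negative: signal decreases as distance increases
```

A negative coefficient, as in the caption, means that pairs whose C-termini are further apart tend to give a weaker PCA signal.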


Table S1A. Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference      Yeast proteins  Complex     Identity (%)
4C2M chain 1   Rpc10           RNApol I    100
4C2M chain 2   Rpa34           RNApol I    92.4
4C2M chain 3   Rpa49           RNApol I    94.4
4C2M chain 4   Rpa43           RNApol I    100
4C2M chain 5   Rpa190          RNApol I    89.7
4C2M chain 6   Rpc40           RNApol I    100
4C2M chain 7   Rpa135          RNApol I    97.2
4C2M chain 8   Rpb5            RNApol I    100
4C2M chain 9   Rpa14           RNApol I    59.6
4C2M chain 10  Rpa43           RNApol I    81.4
4C2M chain 11  Rpo26           RNApol I    100
4C2M chain 12  Rpa12           RNApol I    100
4C2M chain 13  Rpb8            RNApol I    88.2
4C2M chain 14  Rpc19           RNApol I    100
4C2M chain 15  Rpb10           RNApol I    100
4C2M chain 16  Rpa49           RNApol I    100
4C2M chain 17  Rpc10           RNApol I    100
4C2M chain 18  Rpa43           RNApol I    100
4C2M chain 19  Rpa34           RNApol I    92.4
4C2M chain 20  Rpa135          RNApol I    96.2
4C2M chain 21  Rpa190          RNApol I    88.5
4C2M chain 22  Rpa14           RNApol I    55.1
4C2M chain 23  Rpc40           RNApol I    100
4C2M chain 24  Rpo26           RNApol I    100
4C2M chain 25  Rpb5            RNApol I    100
4C2M chain 26  Rpb8            RNApol I    88.2
4C2M chain 27  Rpa43           RNApol I    80.2
4C2M chain 28  Rpb10           RNApol I    100
4C2M chain 29  Rpa12           RNApol I    96
4C2M chain 30  Rpc19           RNApol I    100
4C3I chain A   Rpa190          RNApol I    89.2
4C3I chain C   Rpc40           RNApol I    99.3
4C3I chain B   Rpa135          RNApol I    98.2
4C3I chain E   Rpb5            RNApol I    100
4C3I chain D   Rpa14           RNApol I    55.1
4C3I chain G   Rpa43           RNApol I    78.3
4C3I chain F   Rpo26           RNApol I    100
4C3I chain I   Rpa12           RNApol I    100
4C3I chain H   Rpb8            RNApol I    84.7
4C3I chain K   Rpc19           RNApol I    100
4C3I chain J   Rpb10           RNApol I    100
4C3I chain M   Rpa49           RNApol I    97.2
4C3I chain L   Rpc10           RNApol I    100
4C3I chain N   Rpa34           RNApol I    88
4V1N chain A   Rpo21           RNApol II   97.9
4V1N chain C   Rpb3            RNApol II   100
4V1N chain B   Rpb2            RNApol II   93.6
4V1N chain E   Rpb5            RNApol II   100
4V1N chain D   Rpb4            RNApol II   80.8
4V1N chain G   Rpb7            RNApol II   100
4V1N chain F   Rpo26           RNApol II   100
4V1N chain I   Rpb9            RNApol II   100
4V1N chain H   Rpb8            RNApol II   91
4V1N chain K   Rpb11           RNApol II   100
4V1N chain J   Rpb10           RNApol II   100
4V1N chain L   Rpc10           RNApol II   100
4V1N chain R   Tfg2            RNApol II   60.3
5FJA chain A   Rpo31           RNApol III  96.2
5FJA chain C   Rpc40           RNApol III  100
5FJA chain B   Ret1            RNApol III  100
5FJA chain E   Rpb5            RNApol III  100
5FJA chain D   Rpc17           RNApol III  73.9
5FJA chain G   Rpc25           RNApol III  85.8
5FJA chain F   Rpo26           RNApol III  100
5FJA chain I   Rpc11           RNApol III  82.7
5FJA chain H   Rpb8            RNApol III  94.5
5FJA chain K   Rpc19           RNApol III  100
5FJA chain J   Rpb10           RNApol III  100
5FJA chain M   Rpc37           RNApol III  84.9
5FJA chain L   Rpc10           RNApol III  100
5FJA chain O   Rpc82           RNApol III  84.3
5FJA chain N   Rpc53           RNApol III  73.8
5FJA chain Q   Rpc31           RNApol III  100
5FJA chain P   Rpc34           RNApol III  57.2


Table S2C. Identity between the proteasome structures and the experimental sequences

Reference                        Yeast proteins  Complex     Identity (%)
5CZ4-centered chain A            Pre8            Proteasome  100
5CZ4-centered chain AA           Pre4            Proteasome  100
5CZ4-centered chain B            Pre9            Proteasome  100
5CZ4-centered chain BA           Pre3            Proteasome  100
5CZ4-centered chain C            Pre6            Proteasome  100
5CZ4-centered chain D            Pup2            Proteasome  97.1
5CZ4-centered chain E            Pre5            Proteasome  100
5CZ4-centered chain F            Pre10           Proteasome  100
5CZ4-centered chain G            Scl1            Proteasome  100
5CZ4-centered chain H            Pup1            Proteasome  100
5CZ4-centered chain I            Pup3            Proteasome  100
5CZ4-centered chain J            Pre1            Proteasome  100
5CZ4-centered chain K            Pre2            Proteasome  100
5CZ4-centered chain L            Pre7            Proteasome  100
5CZ4-centered chain M            Pre4            Proteasome  100
5CZ4-centered chain N            Pre3            Proteasome  100
5CZ4-centered chain O            Pre8            Proteasome  100
5CZ4-centered chain P            Pre9            Proteasome  100
5CZ4-centered chain Q            Pre6            Proteasome  100
5CZ4-centered chain R            Pup2            Proteasome  97.1
5CZ4-centered chain S            Pre5            Proteasome  100
5CZ4-centered chain T            Pre10           Proteasome  100
5CZ4-centered chain U            Scl1            Proteasome  100
5CZ4-centered chain V            Pup1            Proteasome  100
5CZ4-centered chain W            Pup3            Proteasome  100
5CZ4-centered chain X            Pre1            Proteasome  100
5CZ4-centered chain Y            Pre2            Proteasome  100
5CZ4-centered chain Z            Pre7            Proteasome  100
5A5B-centered chain A            Pre3            Proteasome  100
5A5B-centered chain AA           Rpn7            Proteasome  100
5A5B-centered chain B            Pup1            Proteasome  100
5A5B-centered chain BA           Rpn3            Proteasome  100
5A5B-centered chain C            Pup3            Proteasome  100
5A5B-centered chain CA           Rpn12           Proteasome  100
5A5B-centered chain D            Pre1            Proteasome  100
5A5B-centered chain DA           Rpn8            Proteasome  82.9
5A5B-centered chain E            Pre2            Proteasome  99.5
5A5B-centered chain EA           Rpn11           Proteasome  89.5
5A5B-centered chain F            Pre7            Proteasome  100
5A5B-centered chain FA           Rpn10           Proteasome  100
5A5B-centered chain G            Pre4            Proteasome  100
5A5B-centered chain GA           Rpn13           Proteasome  100
5A5B-centered chain HA           Sem1            Proteasome  100
5A5B-centered chain IA           Rpn1            Proteasome  85.9
5A5B-centered chain J            Scl1            Proteasome  100
5A5B-centered chain K            Pre8            Proteasome  100
5A5B-centered chain L            Pre9            Proteasome  100
5A5B-centered chain M            Pre6            Proteasome  100
5A5B-centered chain N            Pup2            Proteasome  100
5A5B-centered chain O            Pre5            Proteasome  100
5A5B-centered chain P            Pre10           Proteasome  100
5A5B-centered chain Q            Rpt1            Proteasome  88
5A5B-centered chain R            Rpt2            Proteasome  100
5A5B-centered chain S            Rpt6            Proteasome  100
5A5B-centered chain T            Rpt3            Proteasome  100
5A5B-centered chain U            Rpt4            Proteasome  100
5A5B-centered chain V            Rpt5            Proteasome  93.1
5A5B-centered chain W            Rpn2            Proteasome  90.9
5A5B-centered chain X            Rpn9            Proteasome  100
5A5B-centered chain Y            Rpn5            Proteasome  100
5A5B-centered chain Z            Rpn6            Proteasome  100
Constructed proteasome chain 1   Pup1            Proteasome  100
Constructed proteasome chain 10  Pre8            Proteasome  100
Constructed proteasome chain 11  Pre9            Proteasome  100
Constructed proteasome chain 12  Pre6            Proteasome  100
Constructed proteasome chain 13  Pup2            Proteasome  100
Constructed proteasome chain 14  Pre5            Proteasome  100
Constructed proteasome chain 15  Pre10           Proteasome  100
Constructed proteasome chain 16  Rpt1            Proteasome  88
Constructed proteasome chain 17  Rpt2            Proteasome  100
Constructed proteasome chain 18  Rpt6            Proteasome  100
Constructed proteasome chain 19  Rpt3            Proteasome  100
Constructed proteasome chain 2   Pup3            Proteasome  100
Constructed proteasome chain 20  Rpt4            Proteasome  100
Constructed proteasome chain 21  Rpt5            Proteasome  93.1
Constructed proteasome chain 22  Rpn2            Proteasome  90.9
Constructed proteasome chain 23  Rpn9            Proteasome  100
Constructed proteasome chain 24  Rpn5            Proteasome  100
Constructed proteasome chain 25  Rpn6            Proteasome  100
Constructed proteasome chain 26  Rpn7            Proteasome  100
Constructed proteasome chain 27  Rpn3            Proteasome  100
Constructed proteasome chain 28  Rpn12           Proteasome  100
Constructed proteasome chain 29  Rpn8            Proteasome  82.9
Constructed proteasome chain 3   Pre1            Proteasome  100
Constructed proteasome chain 30  Rpn11           Proteasome  89.5
Constructed proteasome chain 31  Rpn10           Proteasome  100
Constructed proteasome chain 32  Rpn13           Proteasome  100
Constructed proteasome chain 33  Sem1            Proteasome  100
Constructed proteasome chain 34  Rpn1            Proteasome  85.9
Constructed proteasome chain 35  Pup1            Proteasome  100
Constructed proteasome chain 36  Pup3            Proteasome  100
Constructed proteasome chain 37  Pre1            Proteasome  100
Constructed proteasome chain 38  Pre2            Proteasome  100
Constructed proteasome chain 39  Pre7            Proteasome  100
Constructed proteasome chain 4   Pre2            Proteasome  100
Constructed proteasome chain 40  Pre4            Proteasome  100
Constructed proteasome chain 41  Pre3            Proteasome  100
Constructed proteasome chain 42  Pre4            Proteasome  100
Constructed proteasome chain 45  Scl1            Proteasome  100
Constructed proteasome chain 46  Pre8            Proteasome  100
Constructed proteasome chain 47  Pre9            Proteasome  100
Constructed proteasome chain 48  Pre6            Proteasome  100
Constructed proteasome chain 49  Pup2            Proteasome  100
Constructed proteasome chain 5   Pre7            Proteasome  100
Constructed proteasome chain 50  Pre5            Proteasome  100
Constructed proteasome chain 51  Pre10           Proteasome  100
Constructed proteasome chain 52  Rpt1            Proteasome  88
Constructed proteasome chain 53  Rpt2            Proteasome  100
Constructed proteasome chain 54  Rpt6            Proteasome  100
Constructed proteasome chain 55  Rpt3            Proteasome  100
Constructed proteasome chain 56  Rpt4            Proteasome  100
Constructed proteasome chain 57  Rpt5            Proteasome  93.1
Constructed proteasome chain 58  Rpn2            Proteasome  90.9
Constructed proteasome chain 59  Rpn9            Proteasome  100
Constructed proteasome chain 6   Pre3            Proteasome  100
Constructed proteasome chain 60  Rpn5            Proteasome  100
Constructed proteasome chain 61  Rpn6            Proteasome  100
Constructed proteasome chain 62  Rpn7            Proteasome  100
Constructed proteasome chain 63  Rpn3            Proteasome  100
Constructed proteasome chain 64  Rpn12           Proteasome  100
Constructed proteasome chain 65  Rpn8            Proteasome  82.9
Constructed proteasome chain 66  Rpn11           Proteasome  89.5
Constructed proteasome chain 67  Rpn10           Proteasome  100
Constructed proteasome chain 68  Rpn13           Proteasome  100
Constructed proteasome chain 69  Sem1            Proteasome  100
Constructed proteasome chain 70  Rpn1            Proteasome  85.9
Constructed proteasome chain 9   Scl1            Proteasome  100


Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Yeast proteins  Complex     Reference       Number of missing residues in C-ter
Rpa190          RNApol I    4C2M monomer 1  0
Rpa14           RNApol I    4C2M monomer 1  37
Rpa12           RNApol I    4C2M monomer 1  0
Rpb5            RNApol I    4C2M monomer 1  0
Rpb10           RNApol I    4C2M monomer 1  1
Rpa49           RNApol I    4C2M monomer 1  300
Rpc19           RNApol I    4C2M monomer 1  0
Rpb8            RNApol I    4C2M monomer 1  0
Rpa34           RNApol I    4C2M monomer 1  52
Rpa43           RNApol I    4C2M monomer 1  10
Rpc40           RNApol I    4C2M monomer 1  0
Rpc10           RNApol I    4C2M monomer 1  0
Rpa135          RNApol I    4C2M monomer 1  0
Rpo26           RNApol I    4C2M monomer 1  1
Rpa190          RNApol I    4C2M monomer 2  0
Rpa14           RNApol I    4C2M monomer 2  37
Rpa12           RNApol I    4C2M monomer 2  0
Rpb5            RNApol I    4C2M monomer 2  0
Rpb10           RNApol I    4C2M monomer 2  1
Rpa49           RNApol I    4C2M monomer 2  300
Rpc19           RNApol I    4C2M monomer 2  0
Rpb8            RNApol I    4C2M monomer 2  0
Rpa34           RNApol I    4C2M monomer 2  53
Rpa43           RNApol I    4C2M monomer 2  76
Rpc40           RNApol I    4C2M monomer 2  0
Rpc10           RNApol I    4C2M monomer 2  0
Rpa135          RNApol I    4C2M monomer 2  0
Rpo26           RNApol I    4C2M monomer 2  1
Rpa190          RNApol I    4C3I            1
Rpa14           RNApol I    4C3I            37
Rpb5            RNApol I    4C3I            0
Rpb10           RNApol I    4C3I            1
Rpa49           RNApol I    4C3I            301
Rpc19           RNApol I    4C3I            0
Rpb8            RNApol I    4C3I            0
Rpa34           RNApol I    4C3I            53
Rpa12           RNApol I    4C3I            0
Rpa43           RNApol I    4C3I            10
Rpc40           RNApol I    4C3I            0
Rpc10           RNApol I    4C3I            0
Rpa135          RNApol I    4C3I            0
Rpo26           RNApol I    4C3I            1
Rpb3            RNApol II   4V1N            50
Rpb11           RNApol II   4V1N            6
Rpb5            RNApol II   4V1N            0
Rpb7            RNApol II   4V1N            0
Rpb10           RNApol II   4V1N            5
Rpo26           RNApol II   4V1N            0
Rpb8            RNApol II   4V1N            0
Rpb4            RNApol II   4V1N            0
Rpb9            RNApol II   4V1N            2
Tfg2            RNApol II   4V1N            173
Rpb2            RNApol II   4V1N            0
Rpc10           RNApol II   4V1N            0
Rpo21           RNApol II   4V1N            278
Rpc11           RNApol III  5FJA            0
Rpc19           RNApol III  5FJA            0
Ret1            RNApol III  5FJA            0
Rpb5            RNApol III  5FJA            0
Rpb10           RNApol III  5FJA            3
Rpc37           RNApol III  5FJA            20
Rpc82           RNApol III  5FJA            0
Rpc31           RNApol III  5FJA            182
Rpb8            RNApol III  5FJA            0
Rpc53           RNApol III  5FJA            0
Rpc25           RNApol III  5FJA            0
Rpc34           RNApol III  5FJA            2
Rpo31           RNApol III  5FJA            0
Rpc40           RNApol III  5FJA            0
Rpc10           RNApol III  5FJA            0
Rpc17           RNApol III  5FJA            0
Rpo26           RNApol III  5FJA            2
Rpn6            Proteasome  5CZ4 and 5A5B   3
Rpn5            Proteasome  5CZ4 and 5A5B   3
Rpn3            Proteasome  5CZ4 and 5A5B   45
Rpn2            Proteasome  5CZ4 and 5A5B   20
Rpn1            Proteasome  5CZ4 and 5A5B   0
Rpn9            Proteasome  5CZ4 and 5A5B   6
Rpn8            Proteasome  5CZ4 and 5A5B   30
Pre10           Proteasome  5CZ4 and 5A5B   39
Pre6            Proteasome  5CZ4 and 5A5B   10
Pre7            Proteasome  5CZ4 and 5A5B   0
Rpt3            Proteasome  5CZ4 and 5A5B   0
Rpt2            Proteasome  5CZ4 and 5A5B   1
Pre2            Proteasome  5CZ4 and 5A5B   0
Rpt4            Proteasome  5CZ4 and 5A5B   10
Pre1            Proteasome  5CZ4 and 5A5B   3
Pre8            Proteasome  5CZ4 and 5A5B   0
Pre9            Proteasome  5CZ4 and 5A5B   12
Pup2            Proteasome  5CZ4 and 5A5B   9
Pup3            Proteasome  5CZ4 and 5A5B   0
Pup1            Proteasome  5CZ4 and 5A5B   6
Rpn13           Proteasome  5CZ4 and 5A5B   23
Rpn12           Proteasome  5CZ4 and 5A5B   2
Rpn11           Proteasome  5CZ4 and 5A5B   8
Rpn10           Proteasome  5CZ4 and 5A5B   71
Sem1            Proteasome  5CZ4 and 5A5B   0
Scl1            Proteasome  5CZ4 and 5A5B   0
Rpt1            Proteasome  5CZ4 and 5A5B   11
Pre4            Proteasome  5CZ4 and 5A5B   4
Pre5            Proteasome  5CZ4 and 5A5B   0
Rpt5            Proteasome  5CZ4 and 5A5B   0
Pre3            Proteasome  5CZ4 and 5A5B   0
Rpt6            Proteasome  5CZ4 and 5A5B   9
Rpn7            Proteasome  5CZ4 and 5A5B   7


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right owing to the ability of the 4xL linker to detect interactions between proteins that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in standard PCA (left).
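The thresholding described in panel (B) can be sketched as follows. This is only an illustration under stated assumptions: the caption gives the cutoff (z-score above 2.5) but not how the background distribution was estimated, so the sketch assumes a plain z-score against the mean and standard deviation of a set of background colony sizes. All names and values are hypothetical.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # cutoff quoted in the caption


def z_scores(colony_sizes, background):
    """Standardize colony sizes against a background sample (assumed model)."""
    mu, sigma = mean(background), stdev(background)
    return [(size - mu) / sigma for size in colony_sizes]


def positive_ppis(pairs, colony_sizes, background, threshold=Z_THRESHOLD):
    """Return the (pair, z-score) entries whose z-score exceeds the threshold."""
    scores = z_scores(colony_sizes, background)
    return [(pair, round(z, 2)) for pair, z in zip(pairs, scores) if z > threshold]


# Hypothetical colony sizes (arbitrary units)
background = [100, 110, 90, 105, 95]
pairs = [("Pre8", "Pre9"), ("Pre8", "Rpn5"), ("Scl1", "Rpn5")]
sizes = [140, 112, 101]

hits = positive_ppis(pairs, sizes, background)
print(hits)
```

With these invented numbers only the first pair clears the cutoff; the other two fall within the background noise.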


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the cores of the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
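To make the idea of a distance-weighted shortest path in panels (C) and (D) concrete, here is a minimal sketch using Dijkstra's algorithm over a set of surface points. The caption does not describe how the graph was built, so the neighbor cutoff, the function name and the coordinates below are assumptions made for illustration, not the procedure used in the study.

```python
import heapq
import math


def surface_path_length(points, start, end, cutoff):
    """Length of the shortest path from `start` to `end` along surface points.

    `points` is a list of (x, y, z) coordinates. Two points are connected when
    they are at most `cutoff` apart (an assumed rule), and each edge is
    weighted by its Euclidean length. Returns math.inf if no path exists.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == end:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for other in range(len(points)):
            if other == node:
                continue
            step = math.dist(points[node], points[other])
            if step <= cutoff and dist + step < best.get(other, math.inf):
                best[other] = dist + step
                heapq.heappush(queue, (dist + step, other))
    return math.inf


# Hypothetical surface coordinates in angstroms; indices 0 and 3 stand in for
# the C-terminal residues of two proteins.
surface = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 8.0, 0.0), (6.0, 8.0, 5.0)]
length = surface_path_length(surface, start=0, end=3, cutoff=6.0)
print(length, math.dist(surface[0], surface[3]))
```

Because the path is constrained to hop between neighboring surface points, its length is at least the straight-line distance between the two termini, which is the point of measuring along the surface rather than through the complex.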


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are in spatial proximity without necessarily interacting physically. Such a method would allow a deeper and finer dissection of the architecture of protein complexes. Concretely, the approach was to modify the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the set of detected interactions. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results obtained with the distances measured from the available protein structures of the proteasome.

The results of the first validation show that by changing a single parameter, namely doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only a single combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect the majority of the new PPIs and of those whose signal increases. The rare cases where the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. Nevertheless, these cases can still provide structural information, notably by identifying the strongest associations within the complex. Furthermore, the use of extended linkers informs on the organization of protein complexes, particularly when central proteins are involved. Finally, the detected associations reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to DHFR PCA represents a notable advance in the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside the linkers of additional lengths, which would provide a validation and a point of comparison and would help detect problems that may have arisen during the construction of the tagged proteins. For example, problems such as incorrect recombination or the appearance of mutations are easier to spot: interactions would be observed for the correctly constructed protein, whereas the problematic construct would show none. Admittedly, adding this control complicates the experiments and the analyses. Despite this drawback, this variation of DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are in very close proximity, which reduces false positives. DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths providing several resolutions of the PPI network. It would first be necessary to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. It would also be necessary to determine the optimal increment that maximizes new information, taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes the creation of collections of DHFR F[1,2]-tagged proteins with different linker lengths more realistic. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and thus shorter linkers, could be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not perfectly known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. At the scale of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.

43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.

NaCl Sodium chloride

NMR Nuclear magnetic resonance

OD Optical density

PBS Phosphate-buffered saline

PCA Protein-fragment complementation assay

PCR Polymerase chain reaction

PKA Protein kinase A

PPI Protein-protein interaction

Q1 Quartile 1

Q3 Quartile 3

r Correlation coefficient

RNApol RNA polymerase

Sdb Standard deviation

Ser or S Serine

SDS Sodium dodecyl sulfate

SDS-PAGE Sodium dodecyl sulfate polyacrylamide gel electrophoresis

t-test Student's t-test

YPD Yeast extract peptone dextrose

Y2H Yeast two-hybrid

Zs Z-score

μb Estimated mean

μg Microgram

μL Microliter

μM Micromolar

2YT 2× yeast extract tryptone

2xL Linker containing 2 repeats of the Gly-Gly-Gly-Gly-Ser motif

3xL Linker containing 3 repeats of the Gly-Gly-Gly-Gly-Ser motif

4xL Linker containing 4 repeats of the Gly-Gly-Gly-Gly-Ser motif

Acknowledgements

Completing this project required the help of several people, whom I sincerely wish to thank. First of all, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey, Christian encouraged me to give the best of myself, both scientifically and as a member of the team. He not only gave me the material means to do so, but also showed me that I had the ability to do it. Christian is a very present supervisor who is available to his students. He offered me opportunities and supported me in each of them.

I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr. Nicolas Bisson, for their advice and for the time they devoted to this project.

I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the laboratory's two research professionals. Their extensive expertise and their passion for science are a pillar of this team. Without their valuable advice, dedication and availability, carrying out this project would have been particularly arduous. I also wish to thank my collaborators, Xavier Barbeau and Patrick Lagüe. Their excellent work has enriched my thesis. Special thanks go to Xavier for his help, his availability and the lively discussions.

I believe it is important to thank all the members of the Landry laboratory. Graduate studies require spending a great deal of time in the laboratory, which becomes like a second home. Hence the importance of sharing laughter and building camaraderie with its members. I would like to thank them all for the chatting and joking at the famous "tea breaks", the lively discussions and, of course, the support, both in the laboratory and morally. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. Very special thanks go to Guillaume and Hélène, who were especially good at putting a smile on my face and at supporting and advising me in difficult times.

It is also important to thank my parents, as well as all my family and friends. My parents always encouraged me to fulfil myself and to love my work. They not only provided an ideal setting for reaching my goals throughout my studies, but also offered me their moral support and instilled in me the importance of always doing one's best. The values they passed on to me gave me a strong sense of responsibility, honesty and commitment. Thanks to my family and friends, I was able to unwind, simply have fun and pour my heart out from time to time. They were a source of moral support.

Finally, I wish to thank, from the bottom of my heart, my partner, Marc Bélanger. Marc is an incredibly generous person: generous with his time, his listening, his knowledge and his passions. He was an invaluable support throughout this journey, at every moment. His encouragement, his shoulder, his tissues and his understanding soothed my fears and my sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has given me personally, humanly and professionally. Marc has made me a better person, and I will always be grateful to him. Thank you, my love, thank you for everything.

Foreword

This thesis consists of a single chapter written in the form of a scientific article that will be submitted for publication. The article presents an adaptation of the PCA method that detects associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to planning the experiments with Christian R. Landry (project supervisor) and with Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research professionals). Several people, myself included, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were carried out by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were done jointly by Isabelle Gagnon-Arsenault, Christian Landry and myself.

During this project, I also contributed to the writing of a literature review published in Briefings in Functional Genomics in March 2016 under the title "Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps". Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry. That article is not presented in this thesis.

General introduction

1.1 The fundamental nature of protein-protein interactions

Because of the great diversity of their roles, proteins are considered the machinery of life. Their temporary or permanent associations are at the heart of signalling and regulatory pathways as well as protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are essential to the proper functioning of the cell, since they are involved in all cellular processes and in the maintenance of cellular functions.

Interactions that form transiently are often found in signalling and regulatory processes. They require excellent spatiotemporal coordination, which explains why diseases such as cancer arise when this coordination fails (1). One example of a transient association is that of the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, aging and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).

In addition to transient interactions, the cell hosts stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs within a complex are stable, the complex may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins allows additional biological activities to emerge that would be impossible for the individual proteins. A good illustration of this concept is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, which is conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).

The two preceding examples clearly demonstrate the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning and cellular viability of a given organism. The PPI network has been mapped on a large scale for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses (27-29). These maps are a static picture of the network and do not fully take into account the cell's ability to adapt to different conditions (e.g. environment, cell cycle). To overcome this limitation, additional maps were then produced that account for the dynamics of interaction networks by perturbing cell growth conditions. Among other things, they provide information on the adaptation, or plasticity, of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.

1.2 Practical applications of the study of protein-protein interactions

The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore in yeast and trypanosomes (30). These two organisms, which diverged more than 1.5 billion years ago, show both similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the membrane of the cell nucleus and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mRNA export mechanisms in the two organisms (30). In addition, perturbing PPIs makes it possible to elucidate the robustness of a protein complex to mutations, that is, the ability of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analyzing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of some proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).

In the medical field, the study of PPIs has been widely used to discover new drugs (32-34). Moreover, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which will eventually make it possible to treat the infections caused by these parasites (35). PPIs also help explain the genetic basis of diseases, as demonstrated by Sahni and colleagues. This team studied nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of the interaction networks, either partial or complete, was responsible for the diseases under study. Furthermore, different mutations in the same gene lead to different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, since each has advantages and limitations that allow it to target only a particular subset of the interaction network (37). Nonetheless, all of these methods can be divided into two main categories: methods that determine the composition of protein complexes and methods that determine the physical interactions between two proteins.

The first category includes methods that purify a protein complex, by either affinity or separation chromatography, and then analyze it by mass spectrometry (MS). The second category comprises a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the second category rely on a very similar principle: the reconstitution of a functional reporter that emits a signal when the two proteins physically interact. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, the term "hybrid method" refers to methods that detect associations between proteins that are close together in space without these associations necessarily being physical interactions. These methods therefore have characteristics of both categories. For the purposes of this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary: one defines the components of a protein complex and the other defines the relationships among those components.

1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purification of protein complexes followed by identification of their components by MS aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract by means of an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the non-specific interactions that produce background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, which yields a mass spectrum. Comparing the resulting profiles with those in a database makes it possible to identify the proteins found in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first MS scan, a peptide is selected and fragmented, and a second spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). There are other purification techniques, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this purification is that it can isolate all the protein complexes of an organism for study (43).
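To make the notion of a mass-to-charge ratio concrete, the following minimal sketch computes the theoretical m/z of an unmodified peptide at a given positive charge state from standard monoisotopic residue masses. It is an illustration only and is not part of the workflow used in this thesis: the function names `peptide_mass` and `mass_to_charge` are hypothetical, and the example peptide (a single Gly-Gly-Gly-Gly-Ser motif) was chosen arbitrarily.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # one H2O per linear peptide (free N- and C-termini)
PROTON = 1.00728   # mass added by each proton in positive-ion mode


def peptide_mass(sequence: str) -> float:
    """Neutral monoisotopic mass of an unmodified linear peptide (hypothetical helper)."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def mass_to_charge(sequence: str, charge: int) -> float:
    """m/z of the [M + zH]z+ ion for a positive charge state z (hypothetical helper)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


if __name__ == "__main__":
    # Arbitrary example peptide; the same peptide appears at different m/z
    # values depending on how many protons it carries.
    for z in (1, 2):
        print(f"z={z}: m/z = {mass_to_charge('GGGGS', z):.4f}")
```

In a real analysis, dedicated search software matches observed spectra against a protein database; this sketch only shows why the same peptide yields distinct peaks at different charge states.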

1.3.2 Methods determining the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment complementation

Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments, each attached to one of the two proteins of interest through a linker. When the two proteins of interest interact physically, the two reporter fragments assemble, reconstituting a functional reporter that produces a detectable signal. In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that led to the development of several others. However, the technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variations of the method have nevertheless been developed to allow the study of membrane proteins (45-47); this approach is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, produces background noise and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as humans, to be studied in a simpler model (51).

In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest interact physically, the transcription factor attached to the reconstituted ubiquitin is released, activating transcription of a reporter gene. Methods based on split-ubiquitin have enabled major advances in the study of membrane proteins that are insoluble and located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the inability to quantify results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of using a transcription factor as the reporter, it uses a protein that has been split into two fragments. The choice of the reporter and of the cleavage site were decisive in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to associate spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved, in particular, in the synthesis of certain DNA bases (purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative while preserving the native promoter of the proteins studied (48, 55, 56). In addition, PCA results suggest that the cellular localization of the proteins is preserved: there is gene ontology enrichment for many proteins known to share the same cellular localization (55). However, a change in localization cannot be ruled out, since the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. Adding a linker often reduces these risks by distancing the reporter fragment from the protein to which it is attached, thereby reducing interference between the two. The composition or length of the linker may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the function of each element is maintained. They are especially useful when a flexible linker does not separate the two elements sufficiently or interferes with the activity of the protein. In vivo cleavable linkers release the reporter fragment under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although they are classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on energy transfer between two fluorescent proteins that are close to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are close to each other. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and partial overlap between the fluorescence of the two proteins can hamper interpretation of the results (60-63).

Cross-linking followed by MS is almost identical to the purification and MS techniques, except that the proteins are covalently linked to one another before purification. These links withstand enzymatic digestion, providing structural information on how the proteins are associated within the protein complex. However, cross-linking complicates data analysis and can lead to a misleading picture of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity that is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID detects weak and transient interactions as well as interactions between neighbouring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can impair the activity of the protein of interest or the formation of interactions. In addition, the method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They report on the proximity of proteins, giving access to a scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would help to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. They would also compensate for the shortcomings of high-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to many protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It provides good flexibility and allows the reporter fragments to associate well in the cellular environment. Glycine and serine are both small amino acids, the former nonpolar and the latter polar but uncharged. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also constrains the ability to detect an interaction, as observed by the research team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing the linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to obtain new structural information.

1.6 Research objectives

The results above suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length of the DHFR PCA would make it possible to detect interactions between proteins that are progressively farther apart in space, thereby modulating the scale of molecular resolution. This adaptation would yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, the associations of 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess how increasing linker length affects our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The final objective was to determine whether increasing linker length allowed the detection of associations between proteins that are farther apart in space. To do so, distances between the proteins contained in the proteasome structures were calculated and compared with the experimental results.

This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive for several reasons, including the availability of numerous powerful genetic tools, its rapid cell division and the abundance of data on protein complex structures and PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the dihydrofolate reductase-based protein-fragment complementation assay (DHFR PCA) in yeast to obtain low-resolution structural constraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex membership or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, assignment of a protein to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary because the first tells us which proteins assemble into complexes in the cell and the second indicates how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods that detect co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments by a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint imposed by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
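The linker variants described above can be written out explicitly. The following Python sketch is illustrative only: it builds the 2xL, 3xL and 4xL peptide sequences from the Gly-Gly-Gly-Gly-Ser motif and estimates the distance that two linkers could bridge. The per-residue length is an assumption obtained by dividing the 82 Å reported for two 2xL linkers placed end to end by their 20 residues, and the function names are hypothetical.

```python
from itertools import product

MOTIF = "GGGGS"                    # Gly-Gly-Gly-Gly-Ser repeat unit
ANGSTROM_PER_RESIDUE = 82.0 / 20   # assumption: 82 Å spans two extended 10-residue linkers


def linker_sequence(repeats):
    """Peptide sequence of a linker made of `repeats` copies of the motif."""
    if repeats < 1:
        raise ValueError("a linker needs at least one repeat")
    return MOTIF * repeats


def linker_length(repeats):
    """Approximate fully extended length (in Å) of a linker."""
    return len(linker_sequence(repeats)) * ANGSTROM_PER_RESIDUE


def max_span(repeats_a, repeats_b):
    """Upper bound (in Å) on the distance bridged by two linkers end to end."""
    return linker_length(repeats_a) + linker_length(repeats_b)


if __name__ == "__main__":
    # Enumerate every pairing of 2xL, 3xL and 4xL linkers.
    for a, b in product((2, 3, 4), repeat=2):
        print(f"{a}xL + {b}xL: up to {max_span(a, b):.1f} Å")
```

Under these assumptions, two 4xL linkers would bridge at most about 164 Å. Real linkers are flexible and rarely fully extended, so these values should be read as upper bounds rather than measured distances.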

To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted into the plasmid pUC57 with flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed into E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR-amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were made at the 3′ end of genes. The 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused to the DHFR fragments (PCR primer sequences are listed in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.

Estimation of protein abundance

Protein abundance was quantified by Western blot for several strains in which proteins were fused to the 2xL and 4xL. These proteins were selected because their abundance could easily be assessed using antibodies raised against them. Exponentially growing cells (20 OD600) were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL pepstatin A, 0.5 µg/mL leupeptin and 2 µg/mL aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10000), IRDye® 680RD donkey anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed and the signal on the membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins that are part of seven complexes, fused to 2x/3x/4xL-DHFR F[1,2]. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belonged to the same complexes as the baits or interacted with one of them according to data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complex experiment, we reviewed the literature and considered the consensus protein complexes published by (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among their members have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 of the 95 proteins initially selected are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above.

Baits and preys were positioned so that, within a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed with a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
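The array design described above can be illustrated with a short Python sketch. It assumes that each block of four strains is a 2 x 2 square on a 32 x 48 (1536-position) array and that the double border occupies the two outermost rows and columns; the text does not state the block geometry, so this layout, the function names and the random seed are illustrative assumptions rather than the layout actually used.

```python
import random
from itertools import product

ROWS, COLS = 32, 48        # a 1536-position array is 32 x 48
BORDER = 2                 # double border reserved for the control strain
LINKERS = ("2xL", "4xL")


def linker_block():
    """2 x 2 grid of (bait linker, prey linker) combinations."""
    combos = list(product(LINKERS, repeat=2))
    return [combos[:2], combos[2:]]


def block_origins(rows=ROWS, cols=COLS, border=BORDER):
    """Top-left coordinates of every 2 x 2 block that fits inside the border."""
    return [(r, c)
            for r in range(border, rows - border, 2)
            for c in range(border, cols - border, 2)]


def layout(pairs, seed=0):
    """Randomly place one block per bait-prey pair.

    Returns {(row, col): (pair, bait_linker, prey_linker)}.
    """
    origins = block_origins()
    if len(pairs) > len(origins):
        raise ValueError("more pairs than available blocks")
    rng = random.Random(seed)
    rng.shuffle(origins)
    block = linker_block()
    plate = {}
    for pair, (r0, c0) in zip(pairs, origins):
        for dr in range(2):
            for dc in range(2):
                bait_linker, prey_linker = block[dr][dc]
                plate[(r0 + dr, c0 + dc)] = (pair, bait_linker, prey_linker)
    return plate
```

With a double border, a 32 x 48 array leaves 28 x 44 interior positions, which under these assumptions corresponds to 308 blocks of four colonies per plate.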

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool operated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed into 384-format arrays and finally into 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey plates were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Next, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which differences in signal were increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. Correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We applied an image correction where the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and applied a manual threshold across plates. We selected colonies for measurement with a circular selection, using particle detection with the built-in function "Analyze particle" in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, retaining the particle closest to the center. We considered the particle as being a colony if the mass center was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
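The transformation and size filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary keys `mtx_intensity` and `diploid_size` and the example values are hypothetical, and the original analysis was not necessarily written this way.

```python
import math

MIN_DIPLOID_SIZE = 16  # size threshold on the diploid selection plate (from the text)


def log2_intensity(raw_intensity):
    """Return log2(raw + 1); adding 1 keeps a zero intensity finite."""
    return math.log2(raw_intensity + 1)


def transform_colonies(colonies):
    """Filter and transform colonies.

    colonies: list of dicts with the hypothetical keys
    'mtx_intensity' (day-4 intensity on MTX) and 'diploid_size'.
    """
    kept = []
    for colony in colonies:
        if colony["diploid_size"] < MIN_DIPLOID_SIZE:
            continue  # too small on diploid selection: eliminated
        kept.append(log2_intensity(colony["mtx_intensity"]))
    return kept


# Made-up example values, for illustration only.
example = [
    {"mtx_intensity": 0, "diploid_size": 40},
    {"mtx_intensity": 1023, "diploid_size": 40},
    {"mtx_intensity": 500, "diploid_size": 10},  # removed by the size filter
]
scores = transform_colonies(example)
print(scores)
```

The third colony is dropped before transformation, so only two log2 values remain.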

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (µb) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − µb)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
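As a minimal sketch of the scoring step, the snippet below converts interaction scores into z-scores and applies the two thresholds given in the text (2.5 for detection, 2 for a change between linker combinations). The mixture fit itself is not reproduced here; the background mean and standard deviation are hypothetical values standing in for what the mixtools fit would return, and the function names are illustrative.

```python
Z_DETECTION = 2.5  # z-score threshold for a detected interaction (from the text)
Z_CHANGE = 2.0     # minimal z-score difference between linker combinations (from the text)


def z_score(interaction_score, background_mean, background_sd):
    """Zs = (Is - mean of background) / sd of background."""
    return (interaction_score - background_mean) / background_sd


def is_detected(zs):
    return zs > Z_DETECTION


def is_changed(zs_reference, zs_longer):
    """True when the two z-scores differ by more than the change threshold."""
    return abs(zs_longer - zs_reference) > Z_CHANGE


# Hypothetical background parameters and interaction scores.
mu_b, sd_b = 8.0, 0.5
zs_2x = z_score(9.0, mu_b, sd_b)
zs_4x = z_score(10.5, mu_b, sd_b)
print(zs_2x, zs_4x, is_detected(zs_2x), is_detected(zs_4x), is_changed(zs_2x, zs_4x))
```

With these made-up numbers the interaction is missed with the short linker, detected with the long one, and flagged as changed.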

For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. colonies more distant from the median than Q1 − 3(Q3 − Q1) or Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (µb) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
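The outlier rule and the median-based interaction score can be illustrated as follows. This is a sketch, not the original script: the text does not say how quartiles were computed, so the "inclusive" method of `statistics.quantiles` is an assumption, and the replicate values are invented.

```python
import statistics

FENCE_FACTOR = 3  # multiplier on the interquartile range (from the text)


def remove_extreme_outliers(values):
    """Keep values inside [Q1 - 3*IQR, Q3 + 3*IQR].

    The quartile method ("inclusive") is an assumption of this sketch.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - FENCE_FACTOR * iqr, q3 + FENCE_FACTOR * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical replicate colony sizes for one bait-prey pair.
replicates = [10, 11, 12, 13, 14, 40]
kept = remove_extreme_outliers(replicates)
interaction_score = statistics.median(kept)  # Is: median of retained replicates
print(kept, interaction_score)
```

Here the replicate at 40 falls above the upper fence and is discarded before the median is taken.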

At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned in a way that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination: 2xL-4xL, 4xL-2xL and 4xL-4xL. A paired t-test was used to discriminate significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
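The three categories can be summarized in a short sketch. Several points are assumptions and not taken from the text: the FDR method is not named in the source, so the Benjamini-Hochberg procedure is used here as a common choice; the paired t-test is not recomputed and its p-values are taken as given; and a pair with a significant p-value but an absolute difference of 1 or less, a case the text does not cover, is labeled unchanged.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def classify(detected_reference, detected_longer, difference, fdr_p_value):
    """Classify one protein pair for one longer linker combination."""
    if not detected_reference:
        return "new" if detected_longer else "not detected"
    if abs(difference) > 1 and fdr_p_value < 0.05:
        return "quantitative change"
    return "unchanged"


# Hypothetical raw p-values from paired t-tests.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
labels = [
    classify(False, True, 3.0, 0.001),
    classify(True, True, 1.5, q_values[0]),
    classify(True, True, 1.5, q_values[3]),
]
print(q_values, labels)
```

In the source, a pair counts as a quantitative change when at least one longer combination meets the criterion, so a full analysis would apply `classify` to each combination and aggregate the results.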

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched through the RNApol I, II and III protein complexes of the RCSB protein data bank (http://www.rcsb.org) using usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB protein data bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB protein data bank identified PDB IDs 5A5B and 5CZ4. Structure PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL software. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.

The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of a shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of the edges was equal to the distance between node pairs. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure were considered as surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal distances possible was used for the analyses.
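The surface-path distance can be illustrated with a small, self-contained sketch. The source used the Dijkstra implementation in NetworkX; to keep this example free of external dependencies, a plain `heapq` version is swapped in here. The node names and coordinates are hypothetical, and the 15 Å cutoff is the value quoted for the RNApol II.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect nodes closer than `cutoff`; edge weight is the Euclidean distance.

    coords: dict mapping a node name to its (x, y, z) position in angstroms.
    """
    nodes = list(coords)
    graph = {node: [] for node in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when no path exists."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf


# Three hypothetical surface C-alpha atoms; a and c are too far apart for a direct edge.
coords = {"a": (0.0, 0.0, 0.0), "b": (12.0, 5.0, 0.0), "c": (24.0, 0.0, 0.0)}
graph = build_graph(coords, cutoff=15.0)
path_length = shortest_path_length(graph, "a", "c")
print(path_length)
```

Because the direct a-c separation (24 Å) exceeds the cutoff, the path has to go through b, which is how a path constrained to surface nodes can be longer than the straight-line distance.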

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] accounting for only one PPI) after filtration.

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in the interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the fact that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from the lobe a or lobe b (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and interaction z-score of PPIs for all lengths of linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linker sizes (Fig. S1D). Finally, we find that the distance among proteins is significantly longer for cases where longer linker size increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that longer linker size enhances the ability to detect interactions, especially for proteins that are more distant in space.
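The legend of Figure 2 reports Spearman correlations between C-terminal distance and z-score. For readers unfamiliar with the statistic, the following sketch computes it as the Pearson correlation of average ranks. The distances and z-scores are invented for illustration and are not data from this study; the original analysis was presumably run with a statistics package and not with this code.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: z-scores tend to drop as the distance grows.
distances = [20, 45, 60, 90, 120]
z_scores = [9.1, 7.4, 7.9, 3.2, 2.8]
rho = spearman(distances, z_scores)
print(round(rho, 3))
```

A negative value, as in this toy example, means that pairs separated by a longer path tend to give a weaker PCA signal.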

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys while requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to detect low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institute of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canadian Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.

Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering every pair of reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs over the total tested for each combination of subcomplexes within complexes.

Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.

Table S1A. Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference  Yeast proteins  Complex  Identity (%)
4C2M chain 1  Rpc10  RNApol I  100
4C2M chain 2  Rpa34  RNApol I  92.4
4C2M chain 3  Rpa49  RNApol I  94.4
4C2M chain 4  Rpa43  RNApol I  100
4C2M chain 5  Rpa190  RNApol I  89.7
4C2M chain 6  Rpc40  RNApol I  100
4C2M chain 7  Rpa135  RNApol I  97.2
4C2M chain 8  Rpb5  RNApol I  100
4C2M chain 9  Rpa14  RNApol I  59.6
4C2M chain 10  Rpa43  RNApol I  81.4
4C2M chain 11  Rpo26  RNApol I  100
4C2M chain 12  Rpa12  RNApol I  100
4C2M chain 13  Rpb8  RNApol I  88.2
4C2M chain 14  Rpc19  RNApol I  100
4C2M chain 15  Rpb10  RNApol I  100
4C2M chain 16  Rpa49  RNApol I  100
4C2M chain 17  Rpc10  RNApol I  100
4C2M chain 18  Rpa43  RNApol I  100
4C2M chain 19  Rpa34  RNApol I  92.4
4C2M chain 20  Rpa135  RNApol I  96.2
4C2M chain 21  Rpa190  RNApol I  88.5
4C2M chain 22  Rpa14  RNApol I  55.1
4C2M chain 23  Rpc40  RNApol I  100
4C2M chain 24  Rpo26  RNApol I  100
4C2M chain 25  Rpb5  RNApol I  100
4C2M chain 26  Rpb8  RNApol I  88.2
4C2M chain 27  Rpa43  RNApol I  80.2
4C2M chain 28  Rpb10  RNApol I  100
4C2M chain 29  Rpa12  RNApol I  96
4C2M chain 30  Rpc19  RNApol I  100
4C3I chain A  Rpa190  RNApol I  89.2
4C3I chain C  Rpc40  RNApol I  99.3
4C3I chain B  Rpa135  RNApol I  98.2
4C3I chain E  Rpb5  RNApol I  100
4C3I chain D  Rpa14  RNApol I  55.1
4C3I chain G  Rpa43  RNApol I  78.3
4C3I chain F  Rpo26  RNApol I  100
4C3I chain I  Rpa12  RNApol I  100
4C3I chain H  Rpb8  RNApol I  84.7
4C3I chain K  Rpc19  RNApol I  100
4C3I chain J  Rpb10  RNApol I  100
4C3I chain M  Rpa49  RNApol I  97.2
4C3I chain L  Rpc10  RNApol I  100
4C3I chain N  Rpa34  RNApol I  88
4V1N chain A  Rpo21  RNApol II  97.9
4V1N chain C  Rpb3  RNApol II  100
4V1N chain B  Rpb2  RNApol II  93.6
4V1N chain E  Rpb5  RNApol II  100
4V1N chain D  Rpb4  RNApol II  80.8
4V1N chain G  Rpb7  RNApol II  100
4V1N chain F  Rpo26  RNApol II  100
4V1N chain I  Rpb9  RNApol II  100
4V1N chain H  Rpb8  RNApol II  91
4V1N chain K  Rpb11  RNApol II  100
4V1N chain J  Rpb10  RNApol II  100
4V1N chain L  Rpc10  RNApol II  100
4V1N chain R  Tfg2  RNApol II  60.3
5FJA chain A  Rpo31  RNApol III  96.2
5FJA chain C  Rpc40  RNApol III  100
5FJA chain B  Ret1  RNApol III  100
5FJA chain E  Rpb5  RNApol III  100
5FJA chain D  Rpc17  RNApol III  73.9
5FJA chain G  Rpc25  RNApol III  85.8
5FJA chain F  Rpo26  RNApol III  100
5FJA chain I  Rpc11  RNApol III  82.7
5FJA chain H  Rpb8  RNApol III  94.5
5FJA chain K  Rpc19  RNApol III  100
5FJA chain J  Rpb10  RNApol III  100
5FJA chain M  Rpc37  RNApol III  84.9
5FJA chain L  Rpc10  RNApol III  100
5FJA chain O  Rpc82  RNApol III  84.3
5FJA chain N  Rpc53  RNApol III  73.8
5FJA chain Q  Rpc31  RNApol III  100
5FJA chain P  Rpc34  RNApol III  57.2

34

Table S2C Identity between proteasome structure and the experimental sequence

Reference Yeast

proteins Complex

Identity

()

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 971

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 971

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 829

5A5B-centered chain E Pre2 Proteasome 995

5A5B-centered chain EA Rpn11 Proteasome 895

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 859

35

5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 931

5A5B-centered chain W Rpn2 Proteasome 909

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 931

Constructed proteasome chain 22 Rpn2 Proteasome 909

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 829

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 895

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 859

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100

36

Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100

Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I, II and III and proteasome structures

Yeast proteins  Complex  Reference  Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6

Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23

Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7

Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right owing to the ability of the 4xL linker to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiments. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
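
The positive-PPI call described in panel (B) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the analysis pipeline used for the figure: the function names `z_scores` and `call_positive_ppis`, the use of the mean and population standard deviation of all colonies as the reference, the bait and prey labels, the colony sizes and the cutoff value passed in the example are all hypothetical.

```python
import statistics


def z_scores(colony_sizes):
    """Return the z-score of each colony size relative to the whole set."""
    mean = statistics.fmean(colony_sizes)
    sd = statistics.pstdev(colony_sizes)
    if sd == 0:
        raise ValueError("colony sizes have zero variance")
    return [(size - mean) / sd for size in colony_sizes]


def call_positive_ppis(sizes_by_pair, threshold):
    """Return {pair: z-score} for pairs whose z-score exceeds the threshold."""
    pairs = list(sizes_by_pair)
    scores = z_scores([sizes_by_pair[pair] for pair in pairs])
    return {pair: z for pair, z in zip(pairs, scores) if z > threshold}


# Hypothetical plate: 19 background colonies and one strongly growing colony.
plate = {("bait", f"prey{i}"): 100.0 for i in range(19)}
plate[("bait", "prey_hit")] = 900.0

# The cutoff is an assumed example value supplied by the caller.
hits = call_positive_ppis(plate, threshold=2.5)
print(hits)
```

Passing the cutoff as an argument keeps the sketch independent of any particular threshold; in a real screen the reference distribution would more likely come from dedicated negative controls than from the plate as a whole.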

Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
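
The distance-weighted shortest path illustrated in panel (C) can be approximated with Dijkstra's algorithm on a graph whose nodes are surface residues. The sketch below is a hedged illustration, not the procedure used to produce the figure: the function name `shortest_surface_path`, the rule that two residues are connected when they lie within `max_edge` of each other, the residue labels and the toy coordinates are all assumptions. Edge weights are Euclidean distances, so the returned length is the summed distance along the path.

```python
import heapq
import math


def shortest_surface_path(points, start, end, max_edge):
    """Dijkstra over labelled 3D points.

    points: dict mapping a label to an (x, y, z) tuple.
    Two points are connected if their distance is at most max_edge;
    the edge weight is that distance.
    Returns (length, [labels]) or (inf, []) if end is unreachable.
    """
    best = {start: 0.0}
    previous = {}
    visited = set()
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == end:
            # Walk back through the predecessors to rebuild the path.
            path = [end]
            while path[-1] != start:
                path.append(previous[path[-1]])
            return dist, path[::-1]
        for other, coord in points.items():
            if other in visited:
                continue
            step = math.dist(points[node], coord)
            if step > max_edge:
                continue
            candidate = dist + step
            if candidate < best.get(other, math.inf):
                best[other] = candidate
                previous[other] = node
                heapq.heappush(queue, (candidate, other))
    return math.inf, []


# Toy coordinates (hypothetical): two C-termini and two surface residues.
points = {
    "cterm_a": (0.0, 0.0, 0.0),
    "surf_1": (3.0, 0.0, 0.0),
    "surf_2": (3.0, 4.0, 0.0),
    "cterm_b": (6.0, 4.0, 0.0),
}
length, path = shortest_surface_path(points, "cterm_a", "cterm_b", max_edge=4.5)
print(length, path)
```

Restricting the nodes to surface residues makes the path run along the outside of the structure instead of cutting straight through it, so the result is at least as long as the straight-line distance between the two termini.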

General conclusion

The aim of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close to each other in space, without these associations necessarily being physical interactions. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing linker length changed the set of interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thus improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, more than 30% of all the unique interactions detected were new. One could therefore expect to obtain a similar proportion of new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only a single combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect the majority of the new PPIs and of those whose signal increases. The rare cases in which the signal decreased as linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers sheds light on the organization of protein complexes, particularly when central proteins are involved. Finally, the detected associations closely reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to DHFR PCA is a valuable advance in the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside the linkers of additional lengths. This would provide a validation and a point of comparison, and would help detect problems that may have occurred during construction of the proteins. For example, faulty recombination or the appearance of mutations is easier to spot: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are brought very close together, which reduces false positives. DHFR PCA can be performed on either solid or liquid medium, so it is easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would provide several resolutions of the PPI network. It would first be necessary to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also need to be determined, so as to maximize the new information gained while taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, could be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.

Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.

22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.

43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.



Acknowledgments

Completing this project required the help of several people whom I sincerely wish to thank. First of all, I must thank Dr. Christian Landry, my master's supervisor. Throughout this journey, Christian encouraged me to give the best of myself, both scientifically and as a member of the group. He not only gave me the material means to do so, but also showed me that I had the ability to do it. Christian is a very present supervisor who is available to his students. He offered me opportunities and supported me in each of them.

I would also like to thank the members of my advisory committee, Dr. Yves Bourbonnais and Dr. Nicolas Bisson, for their advice and the time they devoted to this project.

I would also like to thank Isabelle Gagnon-Arsenault and Alexandre K. Dubé, the laboratory's two research professionals. Their great expertise and their passion for science are a pillar of this team. Without their valuable advice, dedication and availability, carrying out this project would have been particularly arduous. I also wish to thank my collaborators, Xavier Barbeau and Patrick Lagüe. Thanks to their excellent work, my thesis is all the better. Special thanks to Xavier for his help, his availability and the stimulating discussions.

I believe it is important to thank all the members of the Landry laboratory. Graduate studies require spending a great deal of time in the laboratory, which becomes like a second home. Hence the importance of sharing laughs and building a rapport with its members. I would like to thank them all for the chats and laughter at the famous "tea breaks", the lively discussions and, of course, the support, both in the laboratory and morally. Thank you to Claudine for the summer we shared, to Lou and Éléonore for their help with programming, to Anne-Marie for her collaboration and her smile, and to Marie for her advice on analysis. A very special thank you to Guillaume and Hélène, who were especially good at putting a smile on my face and at supporting and advising me in difficult times.


It is also important to thank my parents, as well as all my family and friends. My parents have always encouraged me to fulfill myself and to love my work. They not only provided an ideal setting for reaching my goals throughout my studies, but also offered me their moral support and instilled in me the importance of always doing one's best. The values they passed on to me gave me a strong sense of responsibility, honesty and commitment. Thanks to my family and friends, I was able to unwind, simply have fun and pour my heart out from time to time. They were a source of moral support.

Finally, I want to thank, from the bottom of my heart, my partner, Marc Bélanger. Marc is an incredibly generous person: generous with his time, his attention, his knowledge and his passions. He was an invaluable support throughout this journey, at every moment. His encouragement, his shoulder, his tissues and his understanding soothed my fears and my sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has given me personally, humanly and professionally. Marc has made me a better person, and I will always be grateful to him. Thank you, my love, thank you for everything.


Foreword

This thesis comprises a single chapter written in the form of a scientific article that will be submitted for publication. The article presents the adaptation of the PCA method to detect associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to the planning of the experiments with Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research professionals). Several people, myself included, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were done jointly by Isabelle Gagnon-Arsenault, Christian Landry and myself.

During this project, I also contributed to the writing of a literature review published in Briefings in Functional Genomics in March 2016 under the title "Multi-scale perturbations of protein interactomes reveals their mechanisms of regulation, robustness and insights into genotype-phenotype maps". Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry. That article is not presented in this thesis.


General introduction

1.1 The fundamental nature of protein-protein interactions

Proteins, through their great diversity of roles, are regarded as the machinery of life. Their temporary or permanent associations lie at the heart of signaling and regulatory pathways as well as protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are essential to the proper functioning of the cell, since they are involved in all cellular processes and in the maintenance of cellular functions.

Interactions that form transiently are often found in signaling and regulatory processes. They require excellent spatiotemporal coordination, which explains why diseases such as cancer arise when that coordination fails (1). One example of a transient association is that of the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, aging and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).

In addition to transient interactions, the cell hosts stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs within a complex are stable, the complex itself may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins allows additional biological activities to emerge that would be impossible for the individual proteins. A good illustration of this concept is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).

The two preceding examples clearly demonstrate the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning and cellular viability of a given organism. The PPI network has been mapped on a large scale for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses (27-29). These maps represent a static picture of the network and do not fully account for the cell's ability to adapt to different conditions (e.g., environment, cell cycle). To overcome this limitation, additional maps were subsequently produced that take the dynamics of interaction networks into account by perturbing cell growth conditions. Among other things, they provide information on the adaptation, or plasticity, of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.

1.2 Concrete applications of the study of protein-protein interactions

The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore of yeast and of the trypanosome (30). These two organisms, which diverged more than 1.5 billion years ago, show similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the membrane of the cell nucleus and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mRNA export mechanisms of the two organisms (30). In addition, perturbing PPIs makes it possible to elucidate the robustness of a protein complex to mutations, that is, the ability of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analyzing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of certain proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).

In medicine, the study of PPIs has been widely used to discover new drugs (32-34). In addition, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which could eventually make it possible to treat the infections caused by these parasites (35). PPIs also help to understand the genetic basis of diseases, as shown by Sahni and colleagues. This team studied nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of interaction networks, whether partial or complete, was responsible for the diseases under study. Moreover, different mutations in the same gene lead to different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, since each has advantages and limitations that allow it to target only a particular subset of the interaction network (37). Even so, all of these methods can be divided into two main categories: methods that determine the composition of protein complexes, and methods that determine the physical interactions between two proteins.

The first category includes methods that purify a protein complex, by affinity or separation chromatography, and then analyze it by mass spectrometry (MS). The second category covers a wide variety of methods, including the yeast two-hybrid (Y2H), the membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the second category work on a very similar principle, based on the reconstitution of a functional reporter that emits a signal when the two proteins interact physically. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, the term "hybrid method" refers to methods that detect associations between proteins that are close together in space without these necessarily being physical interactions. These methods therefore have characteristics of both categories. For the purposes of this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary, because they define, on the one hand, the components of a protein complex and, on the other, the relationships between those components.

1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purifying protein complexes and identifying their components by MS is a method that aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract by means of an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the non-specific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins found in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first MS, a peptide is selected and fragmented, and a new spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this type of purification is that it can isolate all of the protein complexes of an organism for study (43).
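
To make the separation by mass-to-charge ratio more concrete, the short Python sketch below computes the theoretical m/z of a peptide ion from its sequence. It is only an illustration: the function names `peptide_mass` and `mz` and the example peptide are hypothetical, and the monoisotopic residue masses are standard tabulated values, not data from this work.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per peptide (free N- and C-termini)
PROTON = 1.007276   # mass of the charge carrier in positive-ion mode


def peptide_mass(sequence: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(sequence: str, charge: int) -> float:
    """Mass-to-charge ratio of the protonated ion carrying `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


if __name__ == "__main__":
    peptide = "GGGGS"  # hypothetical example peptide
    for z in (1, 2, 3):
        print(f"{peptide} z={z}: m/z = {mz(peptide, z):.4f}")
```

The same peptide appears at a lower m/z as its charge increases, which is why spectra are interpreted in terms of m/z and not mass alone.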

1.3.2 Methods determining the protein interaction network

1.3.2.1 The yeast two-hybrid, the membrane yeast two-hybrid and the protein-fragment complementation assay

Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments attached to the two proteins of interest via a linker. When the two proteins of interest interact physically, the two reporter fragments assemble, reconstituting a functional reporter that produces a detectable signal.

In Y2H, the reporter is a transcription factor which, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that paved the way for several others. However, the technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variations of the method have nevertheless been developed to allow membrane proteins to be studied (45-47); this approach is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, produces background noise and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as humans, to be studied in a simpler model (51).

In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription factor is attached. In the presence of a physical interaction between the proteins of interest, the transcription factor attached to the reconstituted ubiquitin is released, activating the transcription of a reporter gene. Split-ubiquitin-based methods have enabled major advances in the study of proteins that are membrane-bound, insoluble or located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the impossibility of quantifying the results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of using a transcription factor as the reporter, it uses a protein that has been split into two fragments. The choice of reporter and of the cleavage site were decisive elements in the design of the method. Moreover, because the reporter fragments come from a single protein and not from two subunits of the same protein, they do not tend to associate spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved, in particular, in the synthesis of certain DNA bases (purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. The technique has the advantage of being quantitative while preserving the native promoter of the proteins studied (48, 55, 56). Furthermore, results obtained by PCA suggest that the cellular localization of the proteins is preserved. Indeed, there is a gene ontology enrichment for several known proteins sharing the same cellular localization (55). However, a change in localization cannot be ruled out, since the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

One of the major drawbacks of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. Adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. The composition or length of the linker may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are especially useful when a flexible linker does not separate the two elements sufficiently or interferes with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although classified in the second category of methods, FRET, cross-linking followed by MS and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on the transfer of energy between two fluorescent proteins that are close to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. When the two proteins are close to each other, excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and the partial overlap of the fluorescence of the two proteins can hinder the interpretation of the results (60-63).
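
The distance dependence that makes FRET a proximity reporter is usually described by the Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the distance at which transfer is 50% efficient. The Python sketch below simply evaluates this relation. The function name and the R0 of 50 Å are illustrative assumptions, not values taken from this work.

```python
def fret_efficiency(distance: float, forster_radius: float) -> float:
    """Förster transfer efficiency E = 1 / (1 + (r / R0)**6).

    Both arguments must use the same length unit (e.g. angstroms).
    """
    if distance < 0 or forster_radius <= 0:
        raise ValueError("distance must be >= 0 and forster_radius > 0")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


if __name__ == "__main__":
    r0 = 50.0  # assumed Förster radius in angstroms (illustrative only)
    for r in (25.0, 50.0, 75.0, 100.0):
        print(f"r = {r:5.1f} A -> E = {fret_efficiency(r, r0):.3f}")
```

Because of the sixth-power term, the efficiency falls steeply once the distance exceeds R0, which is why a FRET signal indicates proximity and not necessarily direct physical contact.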

Cross-linking followed by MS is practically identical to the purification and MS techniques, except that, before purification, the proteins are attached to one another by covalent bonds. These bonds withstand enzymatic digestion and thus provide structural information on how the proteins are associated within the protein complex. Nevertheless, cross-linking complicates data analysis and may lead to a mistaken picture of the architecture of the protein complex. The method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity, fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to interactions between neighboring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can interfere with the activity of the protein of interest or with the formation of interactions. In addition, the method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly interesting because they give a more global view of the PPI network. They provide information on the proximity of proteins, giving access to a scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, existing techniques require specific infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would make it possible to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. These new hybrid methods would compensate for the shortcomings of high-molecular-resolution methods, such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, those high-resolution methods are difficult to apply to many protein complexes and require an approach specific to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and of the linker that joins the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It ensures good flexibility and proper association of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small amino acids, the former nonpolar and the latter polar and uncharged. The linker joins the reporter fragment to the C-terminus of the proteins under study.

Linker length also constrains the ability to detect an interaction, as notably observed by the research team that developed large-scale PCA (55). While studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. Furthermore, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). Modifying the linker thus makes it possible to detect new interactions and, in doing so, to obtain new structural information.
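As a rough illustration of how the 82 Å figure might scale with linker length, the sketch below assumes that the reachable span grows linearly with the number of linker residues, calibrated on the two standard 10-residue linkers (82 Å / 20 residues = 4.1 Å per residue). The linear scaling and the helper name `max_span` are assumptions made for illustration only; flexible linkers rarely stay fully extended in the cell, so these values are best read as upper bounds.

```python
# Illustrative upper-bound span for a pair of (GGGGS)n linkers.
# Assumption: span scales linearly with residue count, calibrated on
# the 82 angstroms quoted for two standard two-repeat linkers end to end.
RESIDUES_PER_REPEAT = 5            # G-G-G-G-S
ANGSTROM_PER_RESIDUE = 82 / 20     # two 10-residue linkers span 82 angstroms


def max_span(repeats_bait, repeats_prey):
    """Return the approximate maximal end-to-end span in angstroms."""
    residues = RESIDUES_PER_REPEAT * (repeats_bait + repeats_prey)
    return residues * ANGSTROM_PER_RESIDUE


for bait, prey in [(2, 2), (3, 2), (4, 2), (4, 4)]:
    print(f"{bait}xL-{prey}xL: {max_span(bait, prey):.1f} angstroms")
```

Under this assumption, replacing both standard linkers with four-repeat linkers would double the maximal span.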

1.6 Research objectives

The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length of the DHFR PCA would allow the detection of interactions increasingly distant in space, which would modulate the scale of molecular resolution. This adaptation would then yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The last objective was to test whether increasing linker length allowed the detection of associations between proteins located farther apart in space. To do so, distances were calculated between the proteins contained in the proteasome structures and compared with the experimental results.

This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive for several reasons, notably the availability of numerous powerful genetic tools, its rapid cell division and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural constraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that lie farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex membership or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, the assignment of a protein to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary because the first tells us which proteins assemble into complexes in the cell and the second tells us how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods that detect co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint imposed by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter allows the detection of configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.

To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 with flanking BamHI and XbaI restriction sites. The 3xL/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and confirmed by Sanger sequencing.

The pAG25-3xL/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3xL/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3xL/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. 2xL/3xL/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmid with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.

Estimation of protein abundance

Protein abundance was quantified by Western blot for several strains expressing proteins fused with the 2xL and 4xL. These proteins were selected because their abundance could easily be assessed using antibodies raised against them. Exponentially growing cells (20 OD600 units) were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL pepstatin A, 0.5 µg/mL leupeptin and 2 µg/mL aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C, membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10000), IRDye® 680RD donkey anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins, part of seven complexes, fused to 2xL/3xL/4xL-DHFR F[1,2]. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them according to data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published in (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among their members have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 of the 95 proteins initially selected were tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above.

Baits and preys were positioned so that, within a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
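The block-of-four design described above can be sketched in a few lines. This is only an illustration of how the four linker combinations of one block might be enumerated; the function name `linker_block` is hypothetical and the physical plate coordinates are not modeled.

```python
from itertools import product

# The two linker lengths compared in the intra-complexes experiment.
LINKERS = ("2xL", "4xL")


def linker_block():
    """Return the four (bait linker, prey linker) pairs of one block."""
    return list(product(LINKERS, repeat=2))


for bait_linker, prey_linker in linker_block():
    print(f"{bait_linker}-{prey_linker}")
```

Each protein pair therefore occupies four adjacent colonies, one per combination, which is what later allows the combinations to be compared directly.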

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed into 384-format arrays and finally into 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey plates were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30°C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we used standard DHFR PCA to confirm 25 PPIs for which the signal with longer linkers had increased, remained unchanged or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files, and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze Particles” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area smaller than 20 pixels or a circularity below 0.5, and used the particle closest to the center. We considered the particle to be a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
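The particle-selection rule above can be illustrated outside ImageJ. The sketch below is not the custom ImageJ script used in the study; it assumes each detected particle is a dictionary with hypothetical keys (`x`, `y`, `area`, `perimeter`, `touches_edge`) and uses the usual ImageJ definition of circularity, 4π × area / perimeter².

```python
import math


def circularity(area, perimeter):
    """ImageJ-style circularity: 1.0 for a perfect circle, lower otherwise."""
    return 4 * math.pi * area / perimeter ** 2


def pick_colony(particles, center, min_area=20, min_circ=0.5):
    """Return the retained particle closest to the expected center, or None."""
    kept = [
        p for p in particles
        if not p["touches_edge"]                                  # drop edge particles
        and p["area"] >= min_area                                 # drop tiny specks
        and circularity(p["area"], p["perimeter"]) >= min_circ    # drop irregular shapes
    ]
    if not kept:
        return None
    return min(kept, key=lambda p: math.dist((p["x"], p["y"]), center))


# Made-up particles for one grid position centered at (10.5, 10).
particles = [
    {"x": 10, "y": 10, "area": 300, "perimeter": 65, "touches_edge": False},
    {"x": 2, "y": 3, "area": 12, "perimeter": 14, "touches_edge": False},
    {"x": 11, "y": 10, "area": 200, "perimeter": 120, "touches_edge": False},
    {"x": 9.5, "y": 10, "area": 250, "perimeter": 60, "touches_edge": True},
]
colony = pick_colony(particles, center=(10.5, 10))
print(colony["area"] if colony else "no colony")
```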

Colony intensity values from day 4 of growth of the second MTX selection were log2 transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction across different linker size combinations. We considered changes significant when Zs differed by more than 2.
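The scoring steps above (log2 transform, median, background fit, z-score) can be sketched as follows. The study used the R function normalmixEM from mixtools; the plain-Python expectation-maximization routine below is a simplified stand-in written for illustration, and its starting values, iteration count and example scores are assumptions rather than the settings used in the analysis.

```python
import math
from statistics import median


def interaction_score(colony_intensities):
    """Median of log2(intensity + 1) across replicate colonies."""
    return median(math.log2(v + 1) for v in colony_intensities)


def _pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_background(scores, n_iter=200):
    """Fit two normal components by EM; return (mean, sd) of the lower one."""
    xs = list(scores)
    mean_all = sum(xs) / len(xs)
    sd_all = max(math.sqrt(sum((x - mean_all) ** 2 for x in xs) / len(xs)), 1e-3)
    mu = [median(xs), max(xs)]       # assumed start: background near the median
    sd = [sd_all, sd_all]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in xs:
            p = [w[k] * _pdf(x, mu[k], sd[k]) for k in (0, 1)]
            total = (p[0] + p[1]) or 1e-300
            resp.append((p[0] / total, p[1] / total))
        # M-step: update weights, means and standard deviations.
        for k in (0, 1):
            nk = sum(r[k] for r in resp) or 1e-300
            w[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sd[k] = max(math.sqrt(var), 1e-3)
    k_bg = 0 if mu[0] <= mu[1] else 1
    return mu[k_bg], sd[k_bg]


def z_score(score, b, sdb):
    """Zs = (Is - b) / sdb."""
    return (score - b) / sdb


# Made-up interaction scores: ten background pairs and three interacting pairs.
scores = [4.0, 4.5, 5.0, 5.5, 6.0, 4.8, 5.2, 5.1, 4.9, 5.3, 11.5, 12.0, 12.5]
b, sdb = fit_background(scores)
detected = [s for s in scores if z_score(s, b, sdb) > 2.5]
print(detected)
```

Fitting the background separately for each linker combination, as described above, keeps the 2.5 threshold comparable even if longer linkers shift the overall signal.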

For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented inconsistent results for all tested interactions, leaving a total of 228 (197 unique) pairs of interacting proteins.
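The extreme-outlier rule above is a fence at three interquartile ranges beyond the quartiles. A minimal sketch, assuming the default quartile convention of Python's `statistics.quantiles` (the convention used in the original analysis is not stated) and made-up colony sizes:

```python
from statistics import quantiles


def drop_extreme_outliers(values, k=3.0):
    """Keep values within [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Made-up replicate colony sizes with one aberrant colony.
sizes = [10, 11, 12, 13, 14, 12, 11, 13, 12, 100]
print(drop_extreme_outliers(sizes))
```

Using k = 3 rather than the more common 1.5 removes only the most aberrant colonies, so moderate replicate variation is kept.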

At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that all combinations of linker lengths could be tested for a specific interaction within a block of four adjacent strains (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference from the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
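The classification rules above can be summarized in code. The sketch below assumes the paired t-test p-values are already available and that the FDR correction is the Benjamini-Hochberg procedure (the text says only "FDR corrected"); the function names and example numbers are hypothetical. Pairs with a significant but small difference (between -1 and 1) are not explicitly assigned in the text and are treated here as unchanged, which is an assumption.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(detected_ref, detected_longer, diffs, qvalues,
             min_diff=1.0, alpha=0.05):
    """Classify one protein pair from its 2xL-2xL reference and longer linkers."""
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    changed = any(abs(d) > min_diff and q < alpha
                  for d, q in zip(diffs, qvalues))
    return "quantitative change" if changed else "unchanged"


# Made-up example: differences from 2xL-2xL for 2xL-4xL, 4xL-2xL and 4xL-4xL.
qvals = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(qvals)
print(classify(True, True, [1.4, 0.2, 0.1], [0.01, 0.6, 0.7]))
```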

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the USEARCH software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core, while the inner rings of the core were taken from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL with a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
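The surface-path distance described above can be illustrated with a small, self-contained sketch. The study used the Dijkstra implementation in NetworkX; the version below uses only the Python standard library (`heapq`) as a stand-in, and the coordinates are made-up points rather than real Cα positions.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`; weight = distance."""
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue                      # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf


# Three made-up surface points (in angstroms) forming a right angle.
coords = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 10.0, 0.0)]
graph = build_graph(coords, cutoff=12.0)
print(shortest_path_length(graph, 0, 2))
```

With a 12 Å cutoff the direct 14.1 Å edge between the first and last points is not allowed, so the path follows the two 10 Å edges. This is how a cutoff forces paths to follow the surface rather than cut through the complex.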

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n = 45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions (45 baits × 592 preys) in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
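
To make the calling step concrete, the sketch below shows one way a z-score cutoff can be applied to colony sizes. It is an illustration only: the way the z-scores were actually computed is not described in this section, so the definition used here (deviation from the mean of a background distribution, in units of its standard deviation), the function names and the toy numbers are all assumptions. The cutoff is read here as 2.5, and the last lines simply check the size of the screen (15 proteins × 3 linkers = 45 baits, against 592 preys).

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # cutoff for calling a PPI, as read from the text

def z_scores(colony_sizes, background):
    """Assumed definition: (size - background mean) / background standard deviation."""
    mu = mean(background)
    sigma = stdev(background)
    return [(size - mu) / sigma for size in colony_sizes]

def call_ppis(colony_sizes, background, threshold=Z_THRESHOLD):
    """Return True for every colony whose z-score exceeds the threshold."""
    return [z > threshold for z in z_scores(colony_sizes, background)]

# Hypothetical colony sizes (arbitrary units), not data from the screen.
background = [10, 12, 8, 11, 9]
tested = [10, 20, 13]
print(call_ppis(tested, background))

# Size of the screen described in the text.
n_baits = 15 * 3          # 15 proteins, each with the 2xL, 3xL and 4xL
n_tested = n_baits * 592  # each bait queried against 592 preys
print(n_tested)
```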

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, the proteasome and the COG complex), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered true PPIs after filtration (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counting as a single PPI).
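
The distinction between detected pairs and unique pairs can be illustrated with a short sketch. The function name and the example pairs are hypothetical and chosen only to show the bookkeeping: a bait-prey pair and its reciprocal prey-bait pair are collapsed into a single unordered pair.

```python
def unique_pairs(detected):
    """Collapse (bait, prey) and (prey, bait) into one unordered pair."""
    return {frozenset(pair) for pair in detected}

# Illustrative input only: each tuple is (protein fused to DHFR F[1,2], protein fused to DHFR F[3]).
detected = [
    ("Rpb2", "Rpb3"),
    ("Rpb3", "Rpb2"),   # reciprocal of the first pair
    ("Rpb5", "Rpo26"),
]
print(len(detected), len(unique_pairs(detected)))
```

Using an unordered set per pair means the orientation of the fusion is ignored when counting, which is how 228 detected pairs can correspond to 197 unique ones.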

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1, from the core subunit, and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid), regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distance, where positive z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.
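
The rank correlation between pairwise distance and z-score can be computed as in the sketch below. This is a minimal, dependency-free illustration of Spearman's coefficient (Pearson correlation of average ranks), not the statistical code used for the analyses; the function names and the toy values are assumptions, and no p-value is computed here.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's r: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical distances (Å) and z-scores, chosen to be perfectly anti-correlated.
distances = [10, 20, 30, 40]
scores = [5.0, 3.0, 2.0, 0.5]
print(spearman(distances, scores))
```

A rank-based coefficient is a natural choice here because it only assumes that the z-score decreases monotonically with distance, not that the relationship is linear.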

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information on subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.

Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful for inferring the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circles: RNApol I, II and III. Orange squares: proteasome. Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.

Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.

Table S1A. Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference Yeast proteins Complex Identity (%)

4C2M chain 1 Rpc10 RNApol I 100

4C2M chain 2 Rpa34 RNApol I 92.4

4C2M chain 3 Rpa49 RNApol I 94.4

4C2M chain 4 Rpa43 RNApol I 100

4C2M chain 5 Rpa190 RNApol I 89.7

4C2M chain 6 Rpc40 RNApol I 100

4C2M chain 7 Rpa135 RNApol I 97.2

4C2M chain 8 Rpb5 RNApol I 100

4C2M chain 9 Rpa14 RNApol I 59.6

4C2M chain 10 Rpa43 RNApol I 81.4

4C2M chain 11 Rpo26 RNApol I 100

4C2M chain 12 Rpa12 RNApol I 100

4C2M chain 13 Rpb8 RNApol I 88.2

4C2M chain 14 Rpc19 RNApol I 100

4C2M chain 15 Rpb10 RNApol I 100

4C2M chain 16 Rpa49 RNApol I 100

4C2M chain 17 Rpc10 RNApol I 100

4C2M chain 18 Rpa43 RNApol I 100

4C2M chain 19 Rpa34 RNApol I 92.4

4C2M chain 20 Rpa135 RNApol I 96.2

4C2M chain 21 Rpa190 RNApol I 88.5

4C2M chain 22 Rpa14 RNApol I 55.1

4C2M chain 23 Rpc40 RNApol I 100

4C2M chain 24 Rpo26 RNApol I 100

4C2M chain 25 Rpb5 RNApol I 100

4C2M chain 26 Rpb8 RNApol I 88.2

4C2M chain 27 Rpa43 RNApol I 80.2

4C2M chain 28 Rpb10 RNApol I 100

4C2M chain 29 Rpa12 RNApol I 96

4C2M chain 30 Rpc19 RNApol I 100

4C3I chain A Rpa190 RNApol I 89.2

4C3I chain C Rpc40 RNApol I 99.3

4C3I chain B Rpa135 RNApol I 98.2

4C3I chain E Rpb5 RNApol I 100

4C3I chain D Rpa14 RNApol I 55.1

4C3I chain G Rpa43 RNApol I 78.3

4C3I chain F Rpo26 RNApol I 100

4C3I chain I Rpa12 RNApol I 100

4C3I chain H Rpb8 RNApol I 84.7

4C3I chain K Rpc19 RNApol I 100

4C3I chain J Rpb10 RNApol I 100

4C3I chain M Rpa49 RNApol I 97.2

4C3I chain L Rpc10 RNApol I 100

4C3I chain N Rpa34 RNApol I 88

4V1N chain A Rpo21 RNApol II 97.9

4V1N chain C Rpb3 RNApol II 100

4V1N chain B Rpb2 RNApol II 93.6

4V1N chain E Rpb5 RNApol II 100

4V1N chain D Rpb4 RNApol II 80.8

4V1N chain G Rpb7 RNApol II 100

4V1N chain F Rpo26 RNApol II 100

4V1N chain I Rpb9 RNApol II 100

4V1N chain H Rpb8 RNApol II 91

4V1N chain K Rpb11 RNApol II 100

4V1N chain J Rpb10 RNApol II 100

4V1N chain L Rpc10 RNApol II 100

4V1N chain R Tfg2 RNApol II 60.3

5FJA chain A Rpo31 RNApol III 96.2

5FJA chain C Rpc40 RNApol III 100

5FJA chain B Ret1 RNApol III 100

5FJA chain E Rpb5 RNApol III 100

5FJA chain D Rpc17 RNApol III 73.9

5FJA chain G Rpc25 RNApol III 85.8

5FJA chain F Rpo26 RNApol III 100

5FJA chain I Rpc11 RNApol III 82.7

5FJA chain H Rpb8 RNApol III 94.5

5FJA chain K Rpc19 RNApol III 100

5FJA chain J Rpb10 RNApol III 100

5FJA chain M Rpc37 RNApol III 84.9

5FJA chain L Rpc10 RNApol III 100

5FJA chain O Rpc82 RNApol III 84.3

5FJA chain N Rpc53 RNApol III 73.8

5FJA chain Q Rpc31 RNApol III 100

5FJA chain P Rpc34 RNApol III 57.2

Table S2C. Identity between the proteasome structures and the experimental sequences

Reference Yeast proteins Complex Identity (%)

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 97.1

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 97.1

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 82.9

5A5B-centered chain E Pre2 Proteasome 99.5

5A5B-centered chain EA Rpn11 Proteasome 89.5

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 85.9

5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 93.1

5A5B-centered chain W Rpn2 Proteasome 90.9

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 93.1

Constructed proteasome chain 22 Rpn2 Proteasome 90.9

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 82.9

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 89.5

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 85.9

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100

Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100

Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Yeast proteins Complex Reference Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6

Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23

Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7

Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions further in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results of the intra-complex experiment. Examples for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates with colony size in standard PCA (left).

Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.

General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that can detect associations between proteins in close spatial proximity without these necessarily being physical interactions. Such a method would thus make it possible to explore and better dissect the architecture of protein complexes. In practice, this meant modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results obtained with distances measured from the available protein structures of the proteasome.

The results of the first validation show that by adjusting a single parameter, namely doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths used. For example, this number could be reduced by using only a single combination, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment seems to be sufficient to detect the majority of the new PPIs and of those whose signal increases.

The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when it involves the central proteins. Finally, the detected associations reflect well the organization of protein complexes into sub-complexes. Comparing the distances between proteins in the proteasome structures with the PCA results obtained confirms that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.
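The comparison between structural distances and PCA results described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the analysis performed in the thesis: the protein labels, the distances and the detection outcomes are all hypothetical values invented for the example, and the grouping into three categories is an assumption about how such a comparison might be organized.

```python
from statistics import median

# Hypothetical centre-to-centre distances (in angstroms) between protein pairs.
# These values are invented for illustration; they are not data from the thesis.
pair_distance = {
    ("A", "B"): 35.0,
    ("A", "C"): 48.0,
    ("B", "C"): 62.0,
    ("A", "D"): 95.0,
    ("B", "D"): 110.0,
    ("C", "D"): 150.0,
}

# Hypothetical PCA outcomes: pairs that give a signal with each linker.
detected_standard = {("A", "B"), ("A", "C")}
detected_extended = {("A", "B"), ("A", "C"), ("B", "C"), ("A", "D")}


def summarize(distances, standard, extended):
    """Return the median distance of three groups of pairs: those detected
    with the standard linker, those gained only with the extended linker,
    and those never detected. Empty groups map to None."""
    gained = extended - standard
    undetected = set(distances) - standard - extended
    groups = {"standard": standard, "extended_only": gained, "undetected": undetected}
    return {
        name: median(distances[p] for p in pairs) if pairs else None
        for name, pairs in groups.items()
    }


summary = summarize(pair_distance, detected_standard, detected_extended)
print(summary)
```

If longer linkers do reach more distant partners, the median distance of the pairs gained only with the extended linker would be expected to lie between that of the pairs detected with the standard linker and that of the pairs never detected, as it does for these made-up values.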

The modification made to the DHFR PCA represents a valuable advance in the study of protein-protein associations. Simply by doubling the length of the linker of the DHFR F[1,2] fragment, it is possible to increase the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside linkers of additional lengths, which would provide a validation and a point of comparison and would help detect problems that may have arisen during construction of the proteins. For example, it becomes easier to spot faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Admittedly, adding this control makes the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously with each other unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths providing several resolutions of the PPI network. One would first have to determine the maximum length that allows plausible protein-protein associations to be detected while limiting false positives. One would also have to determine the optimal increment for maximizing new information while taking into account the additional complexity that comes with each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and therefore shorter linkers, could prove sufficient, whereas the resolution should be lower for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. "Processing bodies" and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. For the proteome of an organism, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.

22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.

43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


It is also important to thank my parents, as well as all my family and friends. My parents have always encouraged me to fulfill myself and to love my work. Not only did they provide an ideal environment for reaching my goals throughout my studies, they also gave me their moral support and instilled in me the importance of always doing one's best. The values they passed on to me gave me a strong sense of responsibility, honesty and commitment. Thanks to my family and friends, I was able to unwind, simply have fun and pour my heart out from time to time. They were a source of moral support.

Finally, I wish to thank, from the bottom of my heart, my partner, Marc Bélanger. Marc is an incredibly generous person: generous with his time, his attention, his knowledge and his passions. He was an invaluable support throughout this journey, at every moment. His encouragement, his shoulder, his tissues and his understanding eased my fears and my sorrows. He was also there to celebrate the successes. I have no words to describe how much this person has given me personally, humanly and professionally. Marc has made me a better person, and I will always be grateful to him. Thank you, my love, thank you for everything.


Foreword

This thesis consists of a single chapter written in the form of a scientific article that will be submitted for publication. The article presents the adaptation of the PCA method to detect associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to the planning of the experiments with Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research professionals). Several people, myself included, took part in carrying out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were carried out jointly by Isabelle Gagnon-Arsenault, Christian Landry and myself.

During this project, I also contributed to the writing of a literature review published in Briefings in Functional Genomics in March 2016 under the title "Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps". Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry. That article is not presented in this thesis.


General introduction

1.1 The fundamental nature of protein-protein interactions

Proteins, through their great diversity of roles, are considered the machinery of life. Their associations, temporary or permanent, lie at the heart of signaling and regulatory pathways as well as protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since they are involved in all cellular processes as well as in the maintenance of cellular functions.

Interactions that form transiently are often found in signaling and regulatory processes. They require excellent spatiotemporal coordination, which explains why poor coordination can lead to diseases such as cancer (1). One example of a transient association is that of the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, aging and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).

In addition to transient interactions, the cell is home to stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs of a complex are stable, the complex itself may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins allows the emergence of additional biological activities that would be impossible for the individual proteins. An example that illustrates this concept very well is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped catalytic sub-complex flanked by one or two regulatory sub-complexes. It comprises 33 proteins, some present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for combating cancer and neurodegenerative diseases, for example (14-16).

The two preceding examples clearly demonstrate the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning and cellular viability of a given organism. The PPI network has been mapped at large scale for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses (27-29). These maps provide a static picture of the network and do not fully take into account the cell's capacity to adapt to different conditions (e.g., environment, cell cycle). To overcome this limitation, additional maps were subsequently produced that consider the dynamics of interaction networks, that is, by perturbing cell growth conditions. They provide information on, among other things, the adaptation or plasticity of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.

1.2 Practical applications of the study of protein-protein interactions

The study of PPIs offers a new perspective on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore in yeast and trypanosomes (30). These two organisms, which diverged more than 1.5 billion years ago, show similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the membrane of the cell nucleus and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mRNA export mechanisms in the two organisms (30). Furthermore, perturbing PPIs makes it possible to elucidate the robustness of a protein complex to mutations, that is, the capacity of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is the recycling of membrane receptors. By analyzing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of certain proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).

In medicine, the study of PPIs has been widely used to discover new drugs (32-34). In addition, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi, and Trypanosoma brucei, which may eventually make it possible to treat the infections caused by these parasites (35). PPIs also help in understanding the genetic basis of diseases, as shown by Sahni and collaborators. This team examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of interaction networks was responsible for the diseases under study, with the networks affected either partially or completely. Moreover, different mutations in the same gene lead to different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, because each has advantages and limitations that restrict it to a different subset of the interaction network (37). Nevertheless, the methods can be divided into two main categories: those that determine the composition of protein complexes and those that determine physical interactions between two proteins.

The first category includes methods that purify a protein complex, by affinity or separation chromatography, and then analyse it by mass spectrometry (MS). The second category covers a wide variety of methods, including the yeast two-hybrid (Y2H), the membrane yeast two-hybrid (MYTH), and the protein-fragment complementation assay (PCA). The methods in the second category share a very similar principle: the reconstitution of a functional reporter that emits a signal when the two proteins physically interact. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, "hybrid method" refers to methods that detect associations between proteins that are close in space without necessarily interacting physically. These methods therefore have characteristics of both categories. In this project, they are treated as part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary: one defines the components of a protein complex, and the other the relationships among them.

1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry

The purification of protein complexes followed by MS-based identification of their components aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract by means of an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the non-specific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins present in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed: a peptide is selected from a first MS and fragmented, and a second round of spectrometry is carried out on the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this type of purification is that it can isolate all the protein complexes of an organism for subsequent study (43).
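
To make the mass-to-charge separation described above concrete, the following is a minimal Python sketch that computes the monoisotopic m/z of an unmodified peptide at a given charge state. It is an illustration only: the function names and the example peptide are hypothetical, the residue masses are standard monoisotopic values, and the sketch ignores modifications, isotopes, and fragmentation.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # one water per peptide (free N- and C-termini)
PROTON = 1.007276  # mass added per positive charge


def peptide_mass(sequence: str) -> float:
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mass_to_charge(sequence: str, charge: int) -> float:
    """m/z of the peptide carrying `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


if __name__ == "__main__":
    # Hypothetical peptide, used only to show how m/z changes with charge.
    for z in (1, 2, 3):
        print(z, round(mass_to_charge("SAMPLER", z), 4))
```

The same peptide appears at a lower m/z as its charge increases, which is why the spectrometer reports a ratio rather than a mass.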

1.3.2 Methods determining the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid, and protein-fragment complementation assay

Y2H, MYTH, and PCA are techniques based on the assembly of complementary reporter fragments attached to the two proteins of interest through a linker. When the two proteins of interest physically interact, the two reporter fragments assemble and reconstitute a functional reporter that produces a detectable signal. In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that paved the way for several others. However, the technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variants of the method have nevertheless been developed to allow the study of membrane proteins (45-47); one of them is described in the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, produces background noise, and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it makes it possible to study the PPIs of another species, such as humans, in a simpler model (51).

In MYTH, the two reporter fragments derive from a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest physically interact, the transcription factor attached to the reconstituted ubiquitin is released and activates transcription of a reporter gene. Split-ubiquitin-based methods have enabled major advances in the study of membrane proteins, which are insoluble and located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the impossibility of quantifying the results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of using a transcription factor as the reporter, it uses a protein that has been split into two fragments. The choice of reporter and of the split site were decisive in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to associate spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that makes the cell resistant to methotrexate (MTX). This enzyme is essential for cell growth and is involved, among other things, in the synthesis of certain DNA bases (the purines and thymine). In yeast, the signal measured is cell density, that is, the number of cells that managed to grow on the selection medium. The technique has the advantage of being quantitative and of preserving the native promoter of the proteins studied (48, 55, 56). In addition, PCA results suggest that the cellular localization of the proteins is preserved: there is gene ontology enrichment for several proteins known to share the same cellular localization (55). However, a change in localization cannot be ruled out, because the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function, or abundance of the protein. Adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. The composition or length of the linker may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers, and linkers that are cleavable in vivo. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are especially useful when a flexible linker fails to separate the two elements adequately or interferes with the activity of the protein. In vivo cleavable linkers release the reporter fragment under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although placed in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on energy transfer between two fluorescent proteins that are close to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. When the two proteins are close together, excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and the partial overlap between the fluorescence of the two proteins can complicate interpretation of the results (60-63).

Cross-linking followed by MS is nearly identical to purification followed by MS, except that the proteins are covalently linked to one another before purification. These links withstand enzymatic digestion and therefore provide structural information on how the proteins are associated within the complex. However, cross-linking complicates data analysis and can lead to an incorrect picture of the architecture of the protein complex. The method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase that lacks specificity and is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to associations between neighbouring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can interfere with the activity of the protein of interest or with the formation of interactions. In addition, the method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They report on the proximity of proteins, giving access to a molecular scale of resolution that is otherwise difficult to reach. However, the existing techniques are complex, require specialized infrastructure (equipment and databases), and are difficult to apply on a large scale. Simpler, higher-throughput hybrid methods would help define the architecture of protein complexes and their subcomplexes at low molecular resolution. They would complement the two categories of methods. These new hybrid methods would also compensate for the shortcomings of high-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to many protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and because a linker connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble, flexible peptide segment made of two repeats of the following motif: four glycines and one serine (GGGGS). It provides good flexibility and allows the reporter fragments to associate properly in the cellular environment. Glycine and serine are two small amino acids, the former nonpolar and the latter polar but uncharged. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also constrains the ability to detect an interaction, as observed by the research team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. An earlier study had also shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to obtain new structural information.
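
The relationship between linker length and reach can be sketched numerically. The Python example below is a rough illustration, not a structural model: it assumes that the 82 Å figure corresponds to two fully extended 10-residue linkers end to end, back-calculates a length per residue from that figure, and scales it linearly to longer linkers. The linear scaling and all function names are assumptions made for this example; a flexible linker in a cell is unlikely to be fully extended.

```python
MOTIF = "GGGGS"                  # linker repeat unit given in the text
REFERENCE_REPEATS = 2            # standard linker: two GGGGS repeats
REFERENCE_REACH_ANGSTROM = 82.0  # distance threshold reported in the text

# Assumption: 82 A spans two standard linkers (2 x 10 residues) end to end,
# so the implied length per residue is 82 / 20 = 4.1 A.
ANGSTROM_PER_RESIDUE = REFERENCE_REACH_ANGSTROM / (
    2 * REFERENCE_REPEATS * len(MOTIF)
)


def linker_sequence(repeats: int) -> str:
    """Peptide sequence of a linker made of `repeats` GGGGS motifs."""
    if repeats < 1:
        raise ValueError("a linker needs at least one repeat")
    return MOTIF * repeats


def max_reach(repeats_a: int, repeats_b: int) -> float:
    """Upper-bound distance (A) spanned by two linkers end to end."""
    residues = len(linker_sequence(repeats_a)) + len(linker_sequence(repeats_b))
    return residues * ANGSTROM_PER_RESIDUE


if __name__ == "__main__":
    for a, b in [(2, 2), (3, 3), (4, 4), (2, 4)]:
        print(f"{a}xL + {b}xL: {max_reach(a, b):.0f} A")
```

Under these assumptions, each additional GGGGS repeat on one fusion adds about 20 Å to the maximum C-terminus-to-C-terminus distance that the pair of linkers could span.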

1.6 Research objectives

The results above suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length in the DHFR PCA would make it possible to detect interactions between proteins that are increasingly far apart in space, thereby modulating the molecular scale of resolution. This adaptation would yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, 15 proteins found in seven protein complexes were tested for association with the proteins of these complexes and their known interactors. The second objective was to assess how increasing linker length affects our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II, and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The final objective was to test whether increasing linker length allowed the detection of associations between proteins that are farther apart in space. To do so, distances between the proteins in the proteasome structures were calculated and compared with the experimental results.
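
The comparison between structural distances and detection described in the last objective can be illustrated with a small Python sketch. Everything in it is hypothetical: the protein names and coordinates are placeholders rather than values from a proteasome structure, and the reach assigned to each linker length is an assumed upper bound obtained by scaling the 82 Å figure linearly with linker length. The sketch is not the analysis performed in this work.

```python
import math
from itertools import combinations

# Assumed maximum reach (A) when both fusions carry the same linker length.
ASSUMED_REACH = {"2xL": 82.0, "3xL": 123.0, "4xL": 164.0}

# Hypothetical C-terminus coordinates (x, y, z) in angstroms.
C_TERMINI = {
    "ProteinA": (0.0, 0.0, 0.0),
    "ProteinB": (30.0, 40.0, 0.0),
    "ProteinC": (0.0, 0.0, 150.0),
    "ProteinD": (200.0, 0.0, 0.0),
}


def c_termini_distance(p: str, q: str) -> float:
    """Euclidean distance between the C-termini of two proteins."""
    return math.dist(C_TERMINI[p], C_TERMINI[q])


def shortest_sufficient_linker(p: str, q: str):
    """Shortest linker whose assumed reach covers the pair, or None."""
    d = c_termini_distance(p, q)
    for name, reach in sorted(ASSUMED_REACH.items(), key=lambda kv: kv[1]):
        if d <= reach:
            return name
    return None


if __name__ == "__main__":
    for p, q in combinations(C_TERMINI, 2):
        print(p, q, round(c_termini_distance(p, q), 1),
              shortest_sufficient_linker(p, q))
```

A table of this kind could then be set against the experimental PCA signal for each pair and linker combination.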

This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive for several reasons, including the availability of numerous powerful genetic tools, its rapid cell division, and the abundance of data on the structure of protein complexes and on PPIs. This organism has also played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks, and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangement. We examined the potential of the dihydrofolate reductase protein-fragment complementation assay (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that using longer peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are further apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational, and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories, based on whether they infer co-complex membership or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, assignment of a protein to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods, because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87), and technologies based on similar principles (52). The two categories are potentially complementary: one tells us which proteins assemble into complexes in the cell, and the other how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods that detect co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments by a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins at which an interaction remains detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint imposed by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose, and 2% agar for solid medium) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% noble agar, drop-out without adenine, methionine, and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl, and 2% agar for solid medium) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to linkers of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm, and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
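
The idea of using synonymous codons for the added repeats can be checked with a short Python sketch. The two DNA sequences below are hypothetical examples written for this illustration, not the sequences used in the plasmids; only the glycine and serine entries of the standard genetic code are included, since the GGGGS motif needs no others.

```python
# Glycine and serine codons from the standard genetic code.
CODON_TABLE = {
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S",
    "AGT": "S", "AGC": "S",
}


def translate(dna: str) -> str:
    """Translate an in-frame Gly/Ser coding sequence into one-letter code."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical repeats: different nucleotides, same encoded peptide.
REPEAT_A = "GGTGGCGGTGGCTCT"
REPEAT_B = "GGAGGGGGAGGTAGC"

if __name__ == "__main__":
    print(translate(REPEAT_A), translate(REPEAT_B))
    print("identical DNA:", REPEAT_A == REPEAT_B)
```

Both sequences translate to GGGGS although their nucleotide sequences differ, which is what "synonymous" means here.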

To replace the 2xL of pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 with flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI, and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI, and the fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed into E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. The 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains with proteins fused to the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies directed against them. 20 OD600 of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads (425-600 µm; Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published by (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above.


Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.

First, bait plates were prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the difference in signal was increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction where the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, and used the particle closest to the center. We considered the particle as being a colony if the mass center was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2 transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.

For the global PCA experiment, interactions with at least two replicates for all linker combinations were conserved and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered as significant detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
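The scoring steps described above (log2 transformation, median across replicates, conversion to a z-score against the background distribution, threshold of 2.5) can be sketched as follows. This is only an illustrative sketch and not the script used for the study: the function names are hypothetical, and the background mean and standard deviation are passed in as plain numbers, whereas in the analysis they were estimated from the two-component mixture fitted with mixtools.

```python
import math
from statistics import median


def interaction_score(colony_sizes):
    """Median of log2(size + 1) across replicate colonies (the Is)."""
    return median(math.log2(size + 1) for size in colony_sizes)


def z_score(score, background_mean, background_sd):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background component."""
    return (score - background_mean) / background_sd


def is_detected(zs, threshold=2.5):
    """An interaction is called detected when its Zs exceeds the threshold."""
    return zs > threshold


# Hypothetical replicate colony sizes and background parameters (assumptions).
score = interaction_score([3, 7, 15])
zs = z_score(score, background_mean=1.0, background_sd=0.5)
print(score, zs, is_detected(zs))
```

The background parameters are kept as explicit arguments so that the same functions can be applied separately to each linker combination, as done in the text.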

For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as values below Q1 - 3 x (Q3 - Q1) or above Q3 + 3 x (Q3 - Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were conserved and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered as being detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
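The extreme-outlier rule used at the start of this filtering can be illustrated with a short sketch. It assumes the "inclusive" quartile definition of the Python standard library, which may differ slightly from the quartile definition used in the original analysis; the function name and the example values are hypothetical.

```python
from statistics import quantiles


def filter_extreme_outliers(values, k=3.0):
    """Keep values inside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    # Quartile method is an assumption; the source does not specify one.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if low <= v <= high]


# Hypothetical replicate colony sizes with one aberrant value.
print(filter_extreme_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100]))
```

With k = 3, only values far outside the bulk of the replicates are discarded, which is more permissive than the usual 1.5 x interquartile-range rule.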

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1 with t-test FDR p-value < 0.05) (Table S1C).
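The three categories defined above can be sketched in code. Two points are assumptions made only for illustration: the text does not name the FDR procedure, so the Benjamini-Hochberg adjustment is used here as a common choice, and pairs with a significant p-value but an absolute difference of 1 or less are treated as unchanged. Function names are hypothetical, and the paired t-test itself is not reimplemented.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def classify(detected_ref, detected_longer, difference, fdr_p):
    """Assign a protein pair to one of the categories described in the text."""
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    if abs(difference) > 1 and fdr_p < 0.05:
        return "quantitative change"
    return "unchanged"


# Hypothetical raw p-values from paired t-tests.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
print(classify(True, True, 1.8, 0.02))
```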

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched through the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in the PyMOL software. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of the edges was equal to the distance between node pairs. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered as surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal distances possible was used for the analyses.
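The graph-based distance described above can be illustrated with a minimal sketch. The analysis relied on the Dijkstra implementation of NetworkX; the version below uses only the Python standard library as a stand-in so that the example is self-contained. The node labels, coordinates and helper names are hypothetical, and in practice the nodes would be the Cα atoms of the surface residues.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`; edge weight = distance."""
    nodes = list(coords)
    graph = {node: [] for node in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra weighted shortest path length; math.inf if unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf


# Hypothetical surface nodes (coordinates in Å) and a 15 Å cutoff.
coords = {"a": (0, 0, 0), "b": (10, 0, 0), "c": (20, 0, 0), "d": (10, 10, 0)}
graph = build_graph(coords, cutoff=15)
print(shortest_path_length(graph, "a", "c"))
```

Because edges only connect surface nodes within the cutoff, the resulting path follows the surface of the complex rather than cutting through it, which is the reason a graph distance is used instead of the straight-line distance between C-termini.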

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits and random proteins used as controls, for a total of 26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] accounting for only one PPI) after filtration.

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228 or 114/197 unique), no significant change in the interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the fact that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions were detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and interaction z-score of PPIs for all lengths of linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are better detected with longer linker sizes (Fig. S1D). Finally, we find that the distance among proteins is significantly longer for cases where longer linker size increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that longer linker size enhances the ability to detect interactions, especially for proteins that are more distant in space.

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys while requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to detect low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canadian Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation (PCA) screen and prove to be useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering each pair of reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A. Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request.

31

Table S2A Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request

32

Table S2B. Identity between each RNApol structure and the experimental sequences.

Reference       Yeast protein   Complex     Identity (%)
4C2M chain 1    Rpc10           RNApol I    100
4C2M chain 2    Rpa34           RNApol I    92.4
4C2M chain 3    Rpa49           RNApol I    94.4
4C2M chain 4    Rpa43           RNApol I    100
4C2M chain 5    Rpa190          RNApol I    89.7
4C2M chain 6    Rpc40           RNApol I    100
4C2M chain 7    Rpa135          RNApol I    97.2
4C2M chain 8    Rpb5            RNApol I    100
4C2M chain 9    Rpa14           RNApol I    59.6
4C2M chain 10   Rpa43           RNApol I    81.4
4C2M chain 11   Rpo26           RNApol I    100
4C2M chain 12   Rpa12           RNApol I    100
4C2M chain 13   Rpb8            RNApol I    88.2
4C2M chain 14   Rpc19           RNApol I    100
4C2M chain 15   Rpb10           RNApol I    100
4C2M chain 16   Rpa49           RNApol I    100
4C2M chain 17   Rpc10           RNApol I    100
4C2M chain 18   Rpa43           RNApol I    100
4C2M chain 19   Rpa34           RNApol I    92.4
4C2M chain 20   Rpa135          RNApol I    96.2
4C2M chain 21   Rpa190          RNApol I    88.5
4C2M chain 22   Rpa14           RNApol I    55.1
4C2M chain 23   Rpc40           RNApol I    100
4C2M chain 24   Rpo26           RNApol I    100
4C2M chain 25   Rpb5            RNApol I    100
4C2M chain 26   Rpb8            RNApol I    88.2
4C2M chain 27   Rpa43           RNApol I    80.2
4C2M chain 28   Rpb10           RNApol I    100
4C2M chain 29   Rpa12           RNApol I    96
4C2M chain 30   Rpc19           RNApol I    100
4C3I chain A    Rpa190          RNApol I    89.2
4C3I chain C    Rpc40           RNApol I    99.3
4C3I chain B    Rpa135          RNApol I    98.2
4C3I chain E    Rpb5            RNApol I    100
4C3I chain D    Rpa14           RNApol I    55.1
4C3I chain G    Rpa43           RNApol I    78.3
4C3I chain F    Rpo26           RNApol I    100
4C3I chain I    Rpa12           RNApol I    100
4C3I chain H    Rpb8            RNApol I    84.7
4C3I chain K    Rpc19           RNApol I    100
4C3I chain J    Rpb10           RNApol I    100
4C3I chain M    Rpa49           RNApol I    97.2
4C3I chain L    Rpc10           RNApol I    100
4C3I chain N    Rpa34           RNApol I    88
4V1N chain A    Rpo21           RNApol II   97.9
4V1N chain C    Rpb3            RNApol II   100
4V1N chain B    Rpb2            RNApol II   93.6
4V1N chain E    Rpb5            RNApol II   100
4V1N chain D    Rpb4            RNApol II   80.8
4V1N chain G    Rpb7            RNApol II   100
4V1N chain F    Rpo26           RNApol II   100
4V1N chain I    Rpb9            RNApol II   100
4V1N chain H    Rpb8            RNApol II   91
4V1N chain K    Rpb11           RNApol II   100
4V1N chain J    Rpb10           RNApol II   100
4V1N chain L    Rpc10           RNApol II   100
4V1N chain R    Tfg2            RNApol II   60.3
5FJA chain A    Rpo31           RNApol III  96.2
5FJA chain C    Rpc40           RNApol III  100
5FJA chain B    Ret1            RNApol III  100
5FJA chain E    Rpb5            RNApol III  100
5FJA chain D    Rpc17           RNApol III  73.9
5FJA chain G    Rpc25           RNApol III  85.8
5FJA chain F    Rpo26           RNApol III  100
5FJA chain I    Rpc11           RNApol III  82.7
5FJA chain H    Rpb8            RNApol III  94.5
5FJA chain K    Rpc19           RNApol III  100
5FJA chain J    Rpb10           RNApol III  100
5FJA chain M    Rpc37           RNApol III  84.9
5FJA chain L    Rpc10           RNApol III  100
5FJA chain O    Rpc82           RNApol III  84.3
5FJA chain N    Rpc53           RNApol III  73.8
5FJA chain Q    Rpc31           RNApol III  100
5FJA chain P    Rpc34           RNApol III  57.2
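As an illustration of the kind of value reported in the identity column, the sketch below computes a percent identity from two pre-aligned sequences. It is a minimal Python example under stated assumptions: the source does not describe how identity was calculated, so the choice of denominator (aligned positions without a gap in either sequence), the function name and the toy sequences are all hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned positions where neither sequence has a gap.

    The denominator is an assumption; other conventions divide by the
    alignment length or by the length of the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip positions aligned to a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment (not taken from any of the listed structures).
structure_seq = "MKT-AYLV"
experimental_seq = "MKSGAY-V"
identity = percent_identity(structure_seq, experimental_seq)
print(f"{identity:.1f}")
```

With this convention, residues missing from the structure do not lower the identity, which is why missing C-terminal residues would need to be tracked separately.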


Table S2C. Identity between the proteasome structures and the experimental sequences.

Reference                         Yeast protein   Complex     Identity (%)
5CZ4-centered chain A             Pre8            Proteasome  100
5CZ4-centered chain AA            Pre4            Proteasome  100
5CZ4-centered chain B             Pre9            Proteasome  100
5CZ4-centered chain BA            Pre3            Proteasome  100
5CZ4-centered chain C             Pre6            Proteasome  100
5CZ4-centered chain D             Pup2            Proteasome  97.1
5CZ4-centered chain E             Pre5            Proteasome  100
5CZ4-centered chain F             Pre10           Proteasome  100
5CZ4-centered chain G             Scl1            Proteasome  100
5CZ4-centered chain H             Pup1            Proteasome  100
5CZ4-centered chain I             Pup3            Proteasome  100
5CZ4-centered chain J             Pre1            Proteasome  100
5CZ4-centered chain K             Pre2            Proteasome  100
5CZ4-centered chain L             Pre7            Proteasome  100
5CZ4-centered chain M             Pre4            Proteasome  100
5CZ4-centered chain N             Pre3            Proteasome  100
5CZ4-centered chain O             Pre8            Proteasome  100
5CZ4-centered chain P             Pre9            Proteasome  100
5CZ4-centered chain Q             Pre6            Proteasome  100
5CZ4-centered chain R             Pup2            Proteasome  97.1
5CZ4-centered chain S             Pre5            Proteasome  100
5CZ4-centered chain T             Pre10           Proteasome  100
5CZ4-centered chain U             Scl1            Proteasome  100
5CZ4-centered chain V             Pup1            Proteasome  100
5CZ4-centered chain W             Pup3            Proteasome  100
5CZ4-centered chain X             Pre1            Proteasome  100
5CZ4-centered chain Y             Pre2            Proteasome  100
5CZ4-centered chain Z             Pre7            Proteasome  100
5A5B-centered chain A             Pre3            Proteasome  100
5A5B-centered chain AA            Rpn7            Proteasome  100
5A5B-centered chain B             Pup1            Proteasome  100
5A5B-centered chain BA            Rpn3            Proteasome  100
5A5B-centered chain C             Pup3            Proteasome  100
5A5B-centered chain CA            Rpn12           Proteasome  100
5A5B-centered chain D             Pre1            Proteasome  100
5A5B-centered chain DA            Rpn8            Proteasome  82.9
5A5B-centered chain E             Pre2            Proteasome  99.5
5A5B-centered chain EA            Rpn11           Proteasome  89.5
5A5B-centered chain F             Pre7            Proteasome  100
5A5B-centered chain FA            Rpn10           Proteasome  100
5A5B-centered chain G             Pre4            Proteasome  100
5A5B-centered chain GA            Rpn13           Proteasome  100
5A5B-centered chain HA            Sem1            Proteasome  100
5A5B-centered chain IA            Rpn1            Proteasome  85.9
5A5B-centered chain J             Scl1            Proteasome  100
5A5B-centered chain K             Pre8            Proteasome  100
5A5B-centered chain L             Pre9            Proteasome  100
5A5B-centered chain M             Pre6            Proteasome  100
5A5B-centered chain N             Pup2            Proteasome  100
5A5B-centered chain O             Pre5            Proteasome  100
5A5B-centered chain P             Pre10           Proteasome  100
5A5B-centered chain Q             Rpt1            Proteasome  88
5A5B-centered chain R             Rpt2            Proteasome  100
5A5B-centered chain S             Rpt6            Proteasome  100
5A5B-centered chain T             Rpt3            Proteasome  100
5A5B-centered chain U             Rpt4            Proteasome  100
5A5B-centered chain V             Rpt5            Proteasome  93.1
5A5B-centered chain W             Rpn2            Proteasome  90.9
5A5B-centered chain X             Rpn9            Proteasome  100
5A5B-centered chain Y             Rpn5            Proteasome  100
5A5B-centered chain Z             Rpn6            Proteasome  100
Constructed proteasome chain 1    Pup1            Proteasome  100
Constructed proteasome chain 10   Pre8            Proteasome  100
Constructed proteasome chain 11   Pre9            Proteasome  100
Constructed proteasome chain 12   Pre6            Proteasome  100
Constructed proteasome chain 13   Pup2            Proteasome  100
Constructed proteasome chain 14   Pre5            Proteasome  100
Constructed proteasome chain 15   Pre10           Proteasome  100
Constructed proteasome chain 16   Rpt1            Proteasome  88
Constructed proteasome chain 17   Rpt2            Proteasome  100
Constructed proteasome chain 18   Rpt6            Proteasome  100
Constructed proteasome chain 19   Rpt3            Proteasome  100
Constructed proteasome chain 2    Pup3            Proteasome  100
Constructed proteasome chain 20   Rpt4            Proteasome  100
Constructed proteasome chain 21   Rpt5            Proteasome  93.1
Constructed proteasome chain 22   Rpn2            Proteasome  90.9
Constructed proteasome chain 23   Rpn9            Proteasome  100
Constructed proteasome chain 24   Rpn5            Proteasome  100
Constructed proteasome chain 25   Rpn6            Proteasome  100
Constructed proteasome chain 26   Rpn7            Proteasome  100
Constructed proteasome chain 27   Rpn3            Proteasome  100
Constructed proteasome chain 28   Rpn12           Proteasome  100
Constructed proteasome chain 29   Rpn8            Proteasome  82.9
Constructed proteasome chain 3    Pre1            Proteasome  100
Constructed proteasome chain 30   Rpn11           Proteasome  89.5
Constructed proteasome chain 31   Rpn10           Proteasome  100
Constructed proteasome chain 32   Rpn13           Proteasome  100
Constructed proteasome chain 33   Sem1            Proteasome  100
Constructed proteasome chain 34   Rpn1            Proteasome  85.9
Constructed proteasome chain 35   Pup1            Proteasome  100
Constructed proteasome chain 36   Pup3            Proteasome  100
Constructed proteasome chain 37   Pre1            Proteasome  100
Constructed proteasome chain 38   Pre2            Proteasome  100
Constructed proteasome chain 39   Pre7            Proteasome  100
Constructed proteasome chain 4    Pre2            Proteasome  100
Constructed proteasome chain 40   Pre4            Proteasome  100
Constructed proteasome chain 41   Pre3            Proteasome  100
Constructed proteasome chain 42   Pre4            Proteasome  100
Constructed proteasome chain 45   Scl1            Proteasome  100
Constructed proteasome chain 46   Pre8            Proteasome  100
Constructed proteasome chain 47   Pre9            Proteasome  100
Constructed proteasome chain 48   Pre6            Proteasome  100
Constructed proteasome chain 49   Pup2            Proteasome  100
Constructed proteasome chain 5    Pre7            Proteasome  100
Constructed proteasome chain 50   Pre5            Proteasome  100
Constructed proteasome chain 51   Pre10           Proteasome  100
Constructed proteasome chain 52   Rpt1            Proteasome  88
Constructed proteasome chain 53   Rpt2            Proteasome  100
Constructed proteasome chain 54   Rpt6            Proteasome  100
Constructed proteasome chain 55   Rpt3            Proteasome  100
Constructed proteasome chain 56   Rpt4            Proteasome  100
Constructed proteasome chain 57   Rpt5            Proteasome  93.1
Constructed proteasome chain 58   Rpn2            Proteasome  90.9
Constructed proteasome chain 59   Rpn9            Proteasome  100
Constructed proteasome chain 6    Pre3            Proteasome  100
Constructed proteasome chain 60   Rpn5            Proteasome  100
Constructed proteasome chain 61   Rpn6            Proteasome  100
Constructed proteasome chain 62   Rpn7            Proteasome  100
Constructed proteasome chain 63   Rpn3            Proteasome  100
Constructed proteasome chain 64   Rpn12           Proteasome  100
Constructed proteasome chain 65   Rpn8            Proteasome  82.9
Constructed proteasome chain 66   Rpn11           Proteasome  89.5
Constructed proteasome chain 67   Rpn10           Proteasome  100
Constructed proteasome chain 68   Rpn13           Proteasome  100
Constructed proteasome chain 69   Sem1            Proteasome  100
Constructed proteasome chain 70   Rpn1            Proteasome  85.9
Constructed proteasome chain 9    Scl1            Proteasome  100


Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I, II and III and proteasome structures.

Yeast protein   Complex     Reference         Number of missing residues in C-ter
Rpa190          RNApol I    4C2M monomer 1    0
Rpa14           RNApol I    4C2M monomer 1    37
Rpa12           RNApol I    4C2M monomer 1    0
Rpb5            RNApol I    4C2M monomer 1    0
Rpb10           RNApol I    4C2M monomer 1    1
Rpa49           RNApol I    4C2M monomer 1    300
Rpc19           RNApol I    4C2M monomer 1    0
Rpb8            RNApol I    4C2M monomer 1    0
Rpa34           RNApol I    4C2M monomer 1    52
Rpa43           RNApol I    4C2M monomer 1    10
Rpc40           RNApol I    4C2M monomer 1    0
Rpc10           RNApol I    4C2M monomer 1    0
Rpa135          RNApol I    4C2M monomer 1    0
Rpo26           RNApol I    4C2M monomer 1    1
Rpa190          RNApol I    4C2M monomer 2    0
Rpa14           RNApol I    4C2M monomer 2    37
Rpa12           RNApol I    4C2M monomer 2    0
Rpb5            RNApol I    4C2M monomer 2    0
Rpb10           RNApol I    4C2M monomer 2    1
Rpa49           RNApol I    4C2M monomer 2    300
Rpc19           RNApol I    4C2M monomer 2    0
Rpb8            RNApol I    4C2M monomer 2    0
Rpa34           RNApol I    4C2M monomer 2    53
Rpa43           RNApol I    4C2M monomer 2    76
Rpc40           RNApol I    4C2M monomer 2    0
Rpc10           RNApol I    4C2M monomer 2    0
Rpa135          RNApol I    4C2M monomer 2    0
Rpo26           RNApol I    4C2M monomer 2    1
Rpa190          RNApol I    4C3I              1
Rpa14           RNApol I    4C3I              37
Rpb5            RNApol I    4C3I              0
Rpb10           RNApol I    4C3I              1
Rpa49           RNApol I    4C3I              301
Rpc19           RNApol I    4C3I              0
Rpb8            RNApol I    4C3I              0
Rpa34           RNApol I    4C3I              53
Rpa12           RNApol I    4C3I              0
Rpa43           RNApol I    4C3I              10
Rpc40           RNApol I    4C3I              0
Rpc10           RNApol I    4C3I              0
Rpa135          RNApol I    4C3I              0
Rpo26           RNApol I    4C3I              1
Rpb3            RNApol II   4V1N              50
Rpb11           RNApol II   4V1N              6
Rpb5            RNApol II   4V1N              0
Rpb7            RNApol II   4V1N              0
Rpb10           RNApol II   4V1N              5
Rpo26           RNApol II   4V1N              0
Rpb8            RNApol II   4V1N              0
Rpb4            RNApol II   4V1N              0
Rpb9            RNApol II   4V1N              2
Tfg2            RNApol II   4V1N              173
Rpb2            RNApol II   4V1N              0
Rpc10           RNApol II   4V1N              0
Rpo21           RNApol II   4V1N              278
Rpc11           RNApol III  5FJA              0
Rpc19           RNApol III  5FJA              0
Ret1            RNApol III  5FJA              0
Rpb5            RNApol III  5FJA              0
Rpb10           RNApol III  5FJA              3
Rpc37           RNApol III  5FJA              20
Rpc82           RNApol III  5FJA              0
Rpc31           RNApol III  5FJA              182
Rpb8            RNApol III  5FJA              0
Rpc53           RNApol III  5FJA              0
Rpc25           RNApol III  5FJA              0
Rpc34           RNApol III  5FJA              2
Rpo31           RNApol III  5FJA              0
Rpc40           RNApol III  5FJA              0
Rpc10           RNApol III  5FJA              0
Rpc17           RNApol III  5FJA              0
Rpo26           RNApol III  5FJA              2
Rpn6            Proteasome  5CZ4 and 5A5B     3
Rpn5            Proteasome  5CZ4 and 5A5B     3
Rpn3            Proteasome  5CZ4 and 5A5B     45
Rpn2            Proteasome  5CZ4 and 5A5B     20
Rpn1            Proteasome  5CZ4 and 5A5B     0
Rpn9            Proteasome  5CZ4 and 5A5B     6
Rpn8            Proteasome  5CZ4 and 5A5B     30
Pre10           Proteasome  5CZ4 and 5A5B     39
Pre6            Proteasome  5CZ4 and 5A5B     10
Pre7            Proteasome  5CZ4 and 5A5B     0
Rpt3            Proteasome  5CZ4 and 5A5B     0
Rpt2            Proteasome  5CZ4 and 5A5B     1
Pre2            Proteasome  5CZ4 and 5A5B     0
Rpt4            Proteasome  5CZ4 and 5A5B     10
Pre1            Proteasome  5CZ4 and 5A5B     3
Pre8            Proteasome  5CZ4 and 5A5B     0
Pre9            Proteasome  5CZ4 and 5A5B     12
Pup2            Proteasome  5CZ4 and 5A5B     9
Pup3            Proteasome  5CZ4 and 5A5B     0
Pup1            Proteasome  5CZ4 and 5A5B     6
Rpn13           Proteasome  5CZ4 and 5A5B     23
Rpn12           Proteasome  5CZ4 and 5A5B     2
Rpn11           Proteasome  5CZ4 and 5A5B     8
Rpn10           Proteasome  5CZ4 and 5A5B     71
Sem1            Proteasome  5CZ4 and 5A5B     0
Scl1            Proteasome  5CZ4 and 5A5B     0
Rpt1            Proteasome  5CZ4 and 5A5B     11
Pre4            Proteasome  5CZ4 and 5A5B     4
Pre5            Proteasome  5CZ4 and 5A5B     0
Rpt5            Proteasome  5CZ4 and 5A5B     0
Pre3            Proteasome  5CZ4 and 5A5B     0
Rpt6            Proteasome  5CZ4 and 5A5B     9
Rpn7            Proteasome  5CZ4 and 5A5B     7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome: top right; RNApol I, II and III: bottom left; COG complex: bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect interactions between proteins that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assays of selected results from the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
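The positive-call rule described in panel (B) can be sketched as follows. This is a minimal Python illustration, not the pipeline used in the study: only the 2.5 z-score threshold comes from the caption. Standardizing colony sizes against a background distribution, the function names and all numerical values are assumptions.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # threshold stated in the caption


def z_scores(colony_sizes, background):
    """Standardize each colony size against a background distribution.

    Using the mean and sample standard deviation of a background set is an
    assumption; the source does not specify how z-scores were derived.
    """
    mu = mean(background)
    sigma = stdev(background)
    return {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}


def positive_ppis(colony_sizes, background, threshold=Z_THRESHOLD):
    """Return the protein pairs whose z-score exceeds the threshold."""
    scores = z_scores(colony_sizes, background)
    return {pair: z for pair, z in scores.items() if z > threshold}


# Hypothetical colony sizes (arbitrary units) for two bait-prey pairs.
background = [100.0, 110.0, 90.0, 105.0, 95.0]
colony_sizes = {("Pre8", "Pre9"): 400.0, ("Pre8", "Rpn5"): 108.0}

hits = positive_ppis(colony_sizes, background)
for pair, z in hits.items():
    print(pair, round(z, 1))
```

In this toy example only the first pair is called positive; the second falls within the background distribution.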


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
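The distance-weighted shortest path of panel (C) can be illustrated with Dijkstra's algorithm on a graph of surface residues. This is a minimal Python sketch under stated assumptions, not the procedure used in the study: linking residues that lie within a fixed cutoff, weighting each edge by its Euclidean length, the function name, the coordinates and the cutoff value are all hypothetical.

```python
import heapq
from math import dist, inf


def shortest_surface_path(points, start, goal, cutoff):
    """Dijkstra shortest path between two labelled 3D points.

    points maps a label to (x, y, z) coordinates. Two points are linked when
    they are at most `cutoff` apart, and the edge weight is their Euclidean
    distance. Returns (path length, list of labels), or (inf, []) when the
    goal cannot be reached.
    """
    best = {start: 0.0}
    previous = {}
    visited = set()
    queue = [(0.0, start)]
    while queue:
        travelled, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == goal:
            break
        for other, coord in points.items():
            if other in visited:
                continue
            step = dist(points[node], coord)
            if step > cutoff:
                continue  # too far apart to be neighbours on the surface
            candidate = travelled + step
            if candidate < best.get(other, inf):
                best[other] = candidate
                previous[other] = node
                heapq.heappush(queue, (candidate, other))
    if goal not in visited:
        return inf, []
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return best[goal], path[::-1]


# Hypothetical coordinates: two C-termini and two intermediate surface residues.
points = {
    "Scl1_Cter": (0.0, 0.0, 0.0),
    "surface_1": (3.0, 4.0, 0.0),
    "surface_2": (6.0, 8.0, 0.0),
    "Rpn5_Cter": (6.0, 8.0, 5.0),
}
length, path = shortest_surface_path(points, "Scl1_Cter", "Rpn5_Cter", cutoff=6.0)
print(length, path)
```

In this toy case the path along the surface is longer than the straight line between the two C-termini, which is the reason for constraining the path to surface residues: a straight line could pass through the body of the complex.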


General conclusion

The aim of this project was to develop a relatively simple hybrid method. Here, a hybrid method is one that detects associations between proteins that are close to each other in space, without these associations necessarily being physical interactions. Such a method would make it possible to dissect the architecture of protein complexes in greater depth. In practice, the approach consisted of modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to test whether increasing linker length changed the set of detected interactions. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results with the distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thus improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of five protein complexes shows that this variation of the DHFR PCA detects new interactions while retaining the specificity of the method. Among all the unique interactions detected, more than 30% were new. A similar proportion of new interactions could therefore be expected if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths that are used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases.

The rare cases in which the signal decreased as linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. The use of extended linkers also informs on the organization of protein complexes, particularly when central proteins are involved. Finally, the detected associations closely reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to the DHFR PCA is a valuable advance for the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside the longer linkers. This would provide a validation and a point of comparison, and would help detect problems that may have occurred during the construction of the fusion proteins. For example, faulty recombination or the appearance of mutations becomes easier to spot: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and the analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied on a large scale with a robotic platform. Moreover, the DHFR PCA is an in vivo method in which proteins are expressed from their endogenous promoters. The fragments do not tend to associate spontaneously unless they are very close to each other, which reduces false positives. The DHFR PCA can be performed on solid or in liquid medium, so PPIs are easy to study under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features give it certain advantages over other hybrid methods.

Only two linker lengths were tested in this project. It would be interesting to establish a range of linker lengths that would give several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also have to be determined, so as to maximize the new information gained while taking into account the complexity added by each additional linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would give a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. The composition of these complexes is not fully known, but they can be seen by electron microscopy or with other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by extending the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.

Foreword

This thesis consists of a single chapter written in the form of a scientific article that will be submitted for publication. The article presents an adaptation of the PCA method that detects associations between proteins that are distant in space, and its application to the study of protein complexes. I contributed to planning the experiments with Christian R. Landry (project supervisor), Isabelle Gagnon-Arsenault and Alexandre K. Dubé (research professionals). Several people, myself included, carried out these experiments: Isabelle Gagnon-Arsenault, Claudine Lamothe (undergraduate student), Alexandre K. Dubé and Anne-Marie Dion-Côté (postdoctoral fellow). The structural analyses were performed by Xavier Barbeau (collaborator) and Patrick Lagüe (collaborator). The analysis of the results and the writing of the article were done jointly by Isabelle Gagnon-Arsenault, Christian Landry and myself.

During this project, I also contributed to the writing of a literature review published in Briefings in functional genomics in March 2016 under the title "Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps". Several people took part in the writing: Marie Filteau (postdoctoral fellow), Hélène Vignaud (postdoctoral fellow), Samuel Rochette (doctoral student), Guillaume Diss (postdoctoral fellow), Caroline M. Berger (master's student) and Christian R. Landry. That article is not presented in this thesis.

General introduction

1.1 The fundamental role of protein-protein interactions

Because of the wide diversity of their roles, proteins are regarded as the machinery of life. Their associations, whether temporary or permanent, are at the heart of signalling and regulatory pathways as well as protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, Van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since they take part in every cellular process and in the maintenance of cellular functions.

Interactions that form transiently are often found in signalling and regulatory processes. They require excellent spatiotemporal coordination, which explains why poor coordination can lead to diseases such as cancer (1). One example of a transient association is that between the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and in mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, ageing and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).

In addition to transient interactions, the cell hosts stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs within a complex are stable, the complex itself may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins gives rise to additional biological activities that would be impossible for the individual proteins. A very good illustration of this concept is the proteasome, a protein complex involved in protein homeostasis through the degradation of obsolete proteins tagged with a ubiquitin chain. Its structure, which is conserved in eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).

The two preceding examples clearly demonstrate the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning and cellular viability of a given organism. The PPI network has been mapped at large scale for several organisms, notably humans (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses (27-29). These maps are a static picture of the network and do not fully take into account the cell's capacity to adapt to different conditions (e.g. environment, cell cycle). To overcome this limitation, additional maps were later produced that consider the dynamics of interaction networks, that is, by perturbing cell growth conditions. Among other things, they provide information on the adaptation or plasticity of an organism exposed to a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.

1.2 Practical applications of the study of protein-protein interactions

The study of PPIs sheds new light on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore of yeast and of the trypanosome (30). These two organisms, which diverged more than 1.5 billion years ago, show both similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the membrane of the cell nucleus and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mRNA export mechanisms of the two organisms (30). In addition, perturbing PPIs helps elucidate the robustness of a protein complex to mutations, that is, the ability of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and in the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analysing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of some proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).

In medicine, the study of PPIs has been widely used to discover new drugs (32-34). In addition, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which may eventually make it possible to treat the infections caused by these parasites (35). PPIs also help explain the genetic basis of diseases, as shown by Sahni and colleagues. This team examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of the interaction networks was responsible for the diseases under study, affecting the networks either partially or completely. Moreover, different mutations in the same gene lead to different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, since each has advantages and limitations that allow it to target only a particular subset of the interaction network (37). Even so, the methods as a whole can be divided into two main categories: methods that determine the composition of protein complexes, and methods that determine the physical interactions between two proteins.

The first category includes methods that purify a protein complex, by either affinity or separation chromatography, and then analyse it by mass spectrometry (MS). The second category covers a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the second category work on a very similar principle, based on the reconstitution of a functional reporter that emits a signal when the two proteins physically interact. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, the term "hybrid method" refers to methods that detect associations between proteins that are close together in space without these necessarily being physical interactions. These methods therefore have characteristics of both categories. For the purposes of this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary, because together they define, on the one hand, the components of a protein complex and, on the other hand, the relationships those components maintain with one another.

1.3.1 Methods that identify the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purification of protein complexes followed by identification of their components by MS is a method that aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract by means of an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the non-specific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins found in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first MS scan, a peptide is selected and fragmented, and a second round of spectrometry is carried out on the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main benefit of this type of purification is that it can isolate all the protein complexes of an organism for study (43).
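
As a rough illustration of the mass-to-charge separation described above, the following Python sketch computes the m/z value of a peptide ion at a given charge state from standard monoisotopic residue masses. The function name, the example peptide and the restriction to a handful of residues are assumptions made for illustration only; actual identification pipelines compare whole spectra against a database and are considerably more involved.

```python
# Monoisotopic residue masses (Da) for a few amino acids; a complete table
# would cover all 20 residues. These are standard reference values.
RESIDUE_MASS = {
    "G": 57.02146,
    "A": 71.03711,
    "S": 87.03203,
    "P": 97.05276,
    "V": 99.06841,
    "L": 113.08406,
    "K": 128.09496,
    "R": 156.10111,
}
WATER = 18.01056    # H2O added to the residue sum to get the free peptide mass
PROTON = 1.007276   # mass of one proton, the charge carrier in positive mode


def peptide_mz(sequence: str, charge: int) -> float:
    """Return the m/z of a peptide ion carrying `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    neutral_mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    # Hypothetical peptide, shown at charge states 1+ and 2+.
    for z in (1, 2):
        print(f"GASPVLK {z}+: m/z = {peptide_mz('GASPVLK', z):.4f}")
```

The same peptide appears at a different m/z for each charge state, which is one reason spectra are interpreted by software rather than read directly.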

1.3.2 Methods that determine the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment complementation

Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments, each attached to one of the two proteins of interest through a linker. When the two proteins of interest physically interact, the two reporter fragments assemble, reconstituting a functional reporter that produces a detectable signal.

In Y2H, the reporter is a transcription factor which, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that paved the way for several others. However, the technique has some limitations. First, in classical Y2H, the proteins under study must be soluble. Variations of the method have nevertheless been introduced to allow the study of membrane proteins (45-47); this approach is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, suffers from background noise and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. As a result, it is impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as humans, to be studied in a simpler model (51).

In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest physically interact, the transcription factor attached to the reconstituted ubiquitin is released, which activates transcription of a reporter gene. Split-ubiquitin methods have enabled major advances in the study of membrane proteins, insoluble proteins and proteins located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the inability to quantify results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of using a transcription factor as the reporter, it uses a protein that has been split into two fragments. The choice of reporter and of the cleavage site were decisive elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to interact spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that makes the cell resistant to methotrexate (MTX). This enzyme is essential for cell growth and is involved, in particular, in the synthesis of certain DNA bases (purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative while also keeping the native promoter of the proteins under study (48, 55, 56). In addition, results obtained with PCA suggest that the cellular localization of the proteins is preserved: there is a gene ontology enrichment for several known proteins that share the same cellular localization (55). However, a change in localization cannot be ruled out, given that the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

One major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. Adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. Its composition or length may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are most useful when a flexible linker is not sufficient to separate the two elements properly or when it interferes with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released under certain conditions. They are particularly useful for letting each element carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully in order to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on the transfer of energy between two fluorescent proteins that are close to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are close together. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and the partial overlap between the fluorescence of the two proteins can hinder interpretation of the results (60-63).
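
The distance dependence that makes FRET a proximity sensor (the "spectroscopic ruler" of references 61 and 62) is commonly described by the Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the distance at which transfer efficiency is 50%. The Python sketch below simply evaluates this relation. The default R0 of 5 nm and the function name are illustrative assumptions, not values or code taken from this work.

```python
def fret_efficiency(distance_nm: float, r0_nm: float = 5.0) -> float:
    """Förster transfer efficiency for a donor-acceptor pair.

    distance_nm: donor-acceptor separation r, in nanometres.
    r0_nm: Förster radius R0, in nanometres (5.0 is an assumed example value).
    """
    if distance_nm < 0 or r0_nm <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)


if __name__ == "__main__":
    # Efficiency drops steeply once the separation exceeds R0.
    for r in (2.5, 5.0, 7.5, 10.0):
        print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r):.3f}")
```

Because of the sixth-power term, efficiency changes sharply over a narrow range of distances around R0, which is why a FRET signal indicates that two proteins are close together without proving direct physical contact.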

Cross-linking followed by MS is almost identical to the purification and MS techniques, except that the proteins are attached to one another by covalent bonds before purification. These bonds withstand enzymatic digestion and therefore provide structural information on how the proteins are associated within the protein complex. Cross-linking nevertheless complicates data analysis and can also lead to an incorrect picture of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase that lacks specificity and is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to interactions between neighboring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can interfere with the activity of the protein of interest or with the formation of interactions. In addition, the method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly interesting because they give a more global view of the PPI network. They report on the proximity of proteins, giving access to a scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, the existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would make it possible to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. They would also compensate for the shortcomings of high-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to many protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and of the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble, and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It provides good flexibility and allows the reporter fragments to associate properly in the cellular environment. Glycine and serine are both small amino acids, the former nonpolar and the latter polar but uncharged. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also constrains the ability to detect an interaction, as notably observed by the research team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is therefore possible to detect new interactions and, by the same token, to obtain new structural information.

1.6 Research objectives

The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length of the DHFR PCA would allow the detection of interactions increasingly distant in space, thereby modulating the scale of molecular resolution. This adaptation would yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II, and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The final objective was to determine whether increasing linker length allowed the detection of associations between proteins located farther apart in space. To do so, distances between the proteins contained in the proteasome structures were calculated and compared with the experimental results.

This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive for several reasons, notably the availability of numerous powerful genetic tools, its rapid cell division, and the abundance of data on protein complex structures and PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks, and the study of human diseases (70).

Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.

Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.

Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational, and posttranslational levels contributes to the diversity of protein complex assemblies (76-80). Methods used to investigate the organization of PPIs can be grouped into two main categories, based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership because detecting an association does not provide information on the spatial organization of the complex (84-86).

The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87), and technologies based on similar principles (52). The two categories of methods are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods that detect co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins at which an interaction remains detectable. In the first large-scale study performed using DHFR PCA in yeast, it was shown that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
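As a rough illustration of the ruler idea, the following Python sketch scales the 82 Å figure reported for two standard ten-residue linkers placed end to end to longer linker combinations. The per-residue span (82 Å divided by 20 residues) and the assumption that reach grows linearly with residue count are simplifications introduced here for illustration; they are not measurements from this study, and a flexible Gly/Ser linker is unlikely to be fully extended in the cell.

```python
# Back-of-envelope estimate of the span of two PCA linkers placed end to end.
# ASSUMPTION: reach scales linearly with the number of residues, calibrated on
# the 82 Å figure quoted for two standard (2x GGGGS) linkers in ref. 55.

MOTIF = "GGGGS"
REFERENCE_SPAN_ANGSTROM = 82.0            # two 2xL linkers end to end
REFERENCE_RESIDUES = 2 * 2 * len(MOTIF)   # 2 linkers x 2 repeats x 5 residues
ANGSTROM_PER_RESIDUE = REFERENCE_SPAN_ANGSTROM / REFERENCE_RESIDUES


def linker_sequence(repeats: int) -> str:
    """Peptide sequence of a linker made of `repeats` copies of GGGGS."""
    return MOTIF * repeats


def combined_span(repeats_a: int, repeats_b: int) -> float:
    """Estimated end-to-end span (in Å) of two linkers under the linear assumption."""
    residues = len(linker_sequence(repeats_a)) + len(linker_sequence(repeats_b))
    return residues * ANGSTROM_PER_RESIDUE


if __name__ == "__main__":
    for a, b in [(2, 2), (3, 2), (4, 2), (4, 4)]:
        print(f"{a}xL-{b}xL: ~{combined_span(a, b):.1f} Å")
```

Under these assumptions, each added GGGGS repeat extends the theoretical reach by about 20 Å, which gives an order of magnitude for the distances that longer linkers might probe.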

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose, and 2% agar for solid medium) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine, and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).

Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl, and 2% agar for solid medium) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm, and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.

To replace the 2xL of pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted into plasmid pUC57 with flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI, and purified. Plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI, and the fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed into E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR-amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR products were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI, and the fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were made at the 3′ end of genes. The 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused to the DHFR fragment (PCR primer sequences are listed in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). Proper DHFR fragment fusions were confirmed for all strains by PCR and Sanger sequencing.

Estimation of protein abundance

Protein abundance was quantified by Western blot for several strains expressing proteins fused to the 2xL and 4xL. These proteins were selected because their abundance could easily be assessed with antibodies raised against them. Exponentially growing cells (20 OD600) were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL pepstatin A, 0.5 µg/mL leupeptin, and 2 µg/mL aprotinin). Glass beads (425-600 µm; Sigma; 0.1 g) were added and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p, and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p, or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000), or mouse anti-actin (as a loading control; 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10-min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10,000), IRDye® 680RD donkey anti-goat IgG (1:5000), or IRDye® 800CW goat anti-mouse IgG (1:10,000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10-min washes in PBS + 0.2% Tween 20 were performed, and the signal on the membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them, based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published in (84) to choose 95 core and associated protein members of the following complexes: RNApol I, II, and III, the proteasome, and the COG complex. These complexes were selected because they vary in size (RNApol I (n = 14), II (n = 12), III (n = 17), and associated proteins (n = 9; 7 tested); proteasome (n = 47; 44 tested); and COG complex (n = 8)) and because interactions among their members have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with the known organization of these complexes. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 of the 95 proteins initially selected were tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, within a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL, and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid border effects on colony growth.

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed into 384-format arrays and finally into 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey plates were generated and replicated a few times to obtain enough cells to perform crosses with all of the individual baits. Each 1536-bait plate was then crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we used standard DHFR PCA to confirm 25 PPIs for which the difference in signal was an increase, null, or a decrease. The same procedure as described above was used to assess growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments is shown in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated from pixel counts using the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and applied a manual threshold across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in ImageJ64 function "Analyze Particles". We excluded particles touching the edge of the selection and those with an area below 20 pixels and a circularity below 0.5, and used the particle closest to the center. We considered a particle to be a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
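The particle selection rules above can be restated as a small Python sketch. The original analysis was performed in ImageJ; the `Particle` record and `pick_colony` function below are hypothetical names introduced for illustration, and the sketch assumes that a particle is discarded when it fails either the area or the circularity criterion. Circularity is computed as 4π × area / perimeter², the definition used by ImageJ.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Particle:
    """Hypothetical record for one detected particle (units: pixels)."""
    area: float
    perimeter: float
    x: float                  # centroid coordinates within the selection
    y: float
    touches_edge: bool = False


def circularity(p: Particle) -> float:
    """4*pi*area / perimeter^2: 1.0 for a perfect disc, near 0 for a streak."""
    if p.perimeter <= 0:
        return 0.0
    return 4.0 * math.pi * p.area / (p.perimeter ** 2)


def pick_colony(
    particles: Sequence[Particle],
    center: Tuple[float, float],
    min_area: float = 20.0,
    min_circularity: float = 0.5,
) -> Optional[Particle]:
    """Drop edge-touching, small, or irregular particles; keep the one nearest the center."""
    kept = [
        p for p in particles
        if not p.touches_edge
        and p.area >= min_area
        and circularity(p) >= min_circularity
    ]
    if not kept:
        return None
    cx, cy = center
    return min(kept, key=lambda p: math.hypot(p.x - cx, p.y - cy))
```

Returning `None` when no particle passes the filters makes empty positions explicit, so they can be treated as missing replicates downstream.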

Colony intensity values from day 4 of growth on the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained, and the median colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected. These Zs were used to compare the same interaction across linker size combinations. We considered a change significant when the Zs differed by more than 2.
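The scoring steps can be sketched in Python as follows. This is a minimal illustration, not the original pipeline: the background mean and standard deviation are taken as inputs (in the study they come from the two-component normal mixture fitted in R), the function names are hypothetical, and the median is taken over log2-transformed replicate values, which is an assumption about the order of operations. The thresholds of 2.5 and 2 are the values read from the text.

```python
import math
from statistics import median
from typing import Optional, Sequence


def interaction_score(colony_values: Sequence[float],
                      min_replicates: int = 2) -> Optional[float]:
    """Median of log2(value + 1) over replicate colonies; None if too few replicates."""
    if len(colony_values) < min_replicates:
        return None
    return median(math.log2(v + 1.0) for v in colony_values)


def z_score(score: float, background_mean: float, background_sd: float) -> float:
    """Zs = (Is - b) / sdb, with b and sdb estimated from the background component."""
    return (score - background_mean) / background_sd


def is_detected(z: float, threshold: float = 2.5) -> bool:
    """An interaction is called when its z-score exceeds the threshold."""
    return z > threshold


def linker_effect(z_reference: float, z_longer: float,
                  min_delta: float = 2.0) -> str:
    """Compare the same interaction between the 2xL-2xL reference and a longer linker."""
    delta = z_longer - z_reference
    if delta > min_delta:
        return "increased"
    if delta < -min_delta:
        return "decreased"
    return "unchanged"
```

For example, `z_score(interaction_score([3, 7, 15]), 1.0, 0.5)` converts three replicate measurements into a single z-score relative to a background with mean 1.0 and standard deviation 0.5.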

For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained, and the median colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) gave inconsistent results for all tested interactions, leaving a total of 228 (197 unique) pairs of interacting proteins.
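The extreme-outlier rule at the start of this paragraph can be written as a short Python sketch. The function name is hypothetical, and the quartile convention (linear interpolation, equivalent to the default in R) is an assumption, since the text does not state which one was used.

```python
from statistics import quantiles
from typing import List, Sequence


def remove_extreme_outliers(values: Sequence[float], k: float = 3.0) -> List[float]:
    """Keep values inside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)].

    With k = 3 this corresponds to the 'extreme outlier' fences described
    in the text. Very short inputs are returned unchanged because quartiles
    are not meaningful for them.
    """
    if len(values) < 4:
        return list(values)
    # ASSUMPTION: 'inclusive' quartiles (linear interpolation between data points).
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]
```

A multiplier of 3 rather than the more common 1.5 removes only values far from the bulk of the replicates, which preserves genuine variation in colony size.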

At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, within a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL, and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference from the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL, and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, when the interaction was detected with the reference linker size (2xL-2xL) and with the longer linker combinations without any significant change (t-test FDR p-value above 0.05); and quantitative changes, when the interaction was detected with the reference linker size (2xL-2xL) and showed a significant change for at least one longer linker combination (difference greater than 1 or smaller than −1, with t-test FDR p-value < 0.05) (Table S1C).
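The classification logic can be sketched in Python as below. The text does not specify which FDR procedure was applied; the Benjamini-Hochberg adjustment shown here is an assumption, and the function names are hypothetical. The p-values are taken as inputs (the paired t-test itself is not reimplemented), and pairs with a significant p-value but an absolute difference of 1 or less, which the text does not explicitly assign, are placed in the "unchanged" category in this sketch.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def classify(detected_reference: bool, detected_longer: bool,
             difference: float, fdr_p: float,
             alpha: float = 0.05, min_abs_difference: float = 1.0) -> str:
    """Assign a protein pair to one of the categories described in the text."""
    if not detected_reference:
        return "new interaction" if detected_longer else "not detected"
    if fdr_p < alpha and abs(difference) > min_abs_difference:
        return "quantitative change"
    return "unchanged"
```

In practice, the adjusted p-values from `benjamini_hochberg` would be computed once over all tested comparisons and then passed, pair by pair, to `classify`.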

Analysis of protein distances within complexes

Yeast protein sequences of RNApol I, II, and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II, and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N, and 5FJA were selected as representative monomeric complexes for RNApol I, II, and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base, and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid, and half of the core, whereas structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the 5CZ4 structure, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid, and the outer rings of the core, and of 5CZ4 for the inner rings of the core. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.

The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the C-terminal section of the protein is incomplete in the structure. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated as the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes lying within a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the two nodes. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, with a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format. The coordinates of each dot were then extracted from this file. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered surface residues (see Fig S2D for a representation of the method for the proteasome). In cases where multiple copies of a protein were present within a complex, the mean of the minimal possible distances was used for the analyses.
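The distance calculation described in this paragraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' script: it swaps the NetworkX call for a small pure-Python Dijkstra routine so that it runs without third-party packages, and the node names and coordinates in `toy_coords` are hypothetical values chosen only to show how a path along surface nodes differs from a straight line.

```python
import heapq
import math
from itertools import combinations


def build_graph(coords, cutoff):
    """Connect every pair of nodes within `cutoff`; edge weight = Euclidean distance."""
    graph = {name: {} for name in coords}
    for a, b in combinations(coords, 2):
        d = math.dist(coords[a], coords[b])
        if d <= cutoff:
            graph[a][b] = d
            graph[b][a] = d
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when the target cannot be reached."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node].items():
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf


# Hypothetical surface C-alpha coordinates (in angstroms), not taken from any PDB file.
toy_coords = {
    "A_Cterm": (0.0, 0.0, 0.0),
    "s1": (10.0, 0.0, 0.0),
    "s2": (20.0, 0.0, 0.0),
    "B_Cterm": (20.0, 10.0, 0.0),
}
toy_graph = build_graph(toy_coords, cutoff=15.0)
path_length = shortest_path_length(toy_graph, "A_Cterm", "B_Cterm")
straight_line = math.dist(toy_coords["A_Cterm"], toy_coords["B_Cterm"])
```

In this toy case the two C-termini are farther apart than the cutoff, so no direct edge exists and the path runs through the intermediate node `s1`, which makes the path length slightly longer than the straight-line distance.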

All PPI data related to the global PCA and intra-complex experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These preys include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions tested in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
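As an illustration of how a z-score cutoff turns colony sizes into PPI calls, the sketch below scores each bait-prey pair against a background distribution. It rests on several assumptions that the text does not confirm: the z-score is taken as the deviation from the mean of a background sample in units of its standard deviation, the cutoff is read as 2.5, and the colony sizes and pair names are invented for the example (they are not data from Table S1B).

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # assumed reading of the cutoff quoted in the text


def z_scores(signals, background):
    """Score each pair as (signal - background mean) / background standard deviation."""
    mu = mean(background)
    sigma = stdev(background)
    return {pair: (value - mu) / sigma for pair, value in signals.items()}


def call_ppis(signals, background, threshold=Z_THRESHOLD):
    """Return the pairs whose z-score exceeds the threshold, in sorted order."""
    scores = z_scores(signals, background)
    return sorted(pair for pair, z in scores.items() if z > threshold)


# Hypothetical colony sizes (arbitrary units).
background = [98.0, 100.0, 102.0, 99.0, 101.0]
signals = {
    ("baitA", "prey1"): 100.5,
    ("baitA", "prey2"): 110.0,
    ("baitB", "prey1"): 103.0,
}
detected = call_ppis(signals, background)

# Size of the screen described in the text: 15 proteins x 3 linkers = 45 baits, 592 preys.
n_tested = 45 * 592
```

Only the pair with a colony size well above the background spread passes the cutoff here, and the product of baits and preys reproduces the 26,640 tested combinations.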

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, the proteasome and the COG complex), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered true PPIs after filtration (Fig S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI).
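The distinction between interacting pairs and unique pairs can be made concrete with a short sketch. It assumes that each detected interaction is recorded as a (bait, prey) tuple, where the bait carries DHFR F[1,2] and the prey carries DHFR F[3]; the protein names X, Y and Z are placeholders.

```python
def unique_pairs(detected):
    """Collapse (bait, prey) and (prey, bait) into a single unordered pair."""
    return {frozenset(pair) for pair in detected}


# Hypothetical detections: X-Y was seen in both orientations, X-Z in only one.
detected = [("X", "Y"), ("Y", "X"), ("X", "Z")]
unique = unique_pairs(detected)
```

Three oriented detections therefore reduce to two unique PPIs, which is the kind of reduction behind the 228 versus 197 counts reported here.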

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B). The PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.
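The counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the text; the variable names are chosen for the example.

```python
# Counts quoted in the text: (all interacting pairs, unique pairs).
total = (228, 197)
unchanged = (135, 114)
changed = (19, 18)   # quantitatively changed PCA signal
new = (74, 65)       # detected only with at least one 4xL

# Pairs affected by the longer linker = changed + new.
affected = tuple(c + n for c, n in zip(changed, new))

# Unchanged and affected pairs should add up to the totals.
consistent = all(u + a == t for u, a, t in zip(unchanged, affected, total))

# Percentages of unchanged pairs and of new unique pairs.
pct_unchanged = [100 * u / t for u, t in zip(unchanged, total)]
pct_new_unique = 100 * new[1] / total[1]
```

The sums give 93 (83 unique) affected pairs, the unchanged fraction comes out just under 60% in both counts, and new interactions make up roughly a third of the unique pairs.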

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from destabilization of the protein (Fig 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D, center panel). In the COG complex, new interactions were seen between Cog1, from the core subunit, and proteins from lobe A or lobe B (Fig 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, the PCA signal obtained with longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig S1D). Finally, we find that the distance among proteins is significantly longer for cases where a longer linker increases the signal or leads to the detection of new interactions (Fig 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.
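The legend of Fig 2B reports Spearman correlations between pairwise distance and z-score. The sketch below shows one way such a coefficient can be computed, as the Pearson correlation of average ranks, using only the standard library. The distances and z-scores are invented for the example and are not values from Table S2A; the analysis software used by the authors is not stated in this section.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical C-terminus distances (angstroms) and PPI z-scores.
distances = [20.0, 45.0, 60.0, 90.0, 130.0]
z = [9.1, 7.4, 7.9, 3.2, 1.0]
rho = spearman(distances, z)
```

A negative `rho` means that z-scores tend to fall as the distance between C-termini grows, which is the direction of the trend described for the proteasome.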

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research (CIHR) Grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A. Description of the strains constructed and used for this study.
Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment.
Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment.
Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study.
Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling.
Table S2A is too lengthy to be included in this document but can be obtained upon request.


Table S2B. Identity between each RNApol structure and the experimental sequences.

Reference  Yeast protein  Complex  Identity (%)
4C2M chain 1  Rpc10  RNApol I  100
4C2M chain 2  Rpa34  RNApol I  92.4
4C2M chain 3  Rpa49  RNApol I  94.4
4C2M chain 4  Rpa43  RNApol I  100
4C2M chain 5  Rpa190  RNApol I  89.7
4C2M chain 6  Rpc40  RNApol I  100
4C2M chain 7  Rpa135  RNApol I  97.2
4C2M chain 8  Rpb5  RNApol I  100
4C2M chain 9  Rpa14  RNApol I  59.6
4C2M chain 10  Rpa43  RNApol I  81.4
4C2M chain 11  Rpo26  RNApol I  100
4C2M chain 12  Rpa12  RNApol I  100
4C2M chain 13  Rpb8  RNApol I  88.2
4C2M chain 14  Rpc19  RNApol I  100
4C2M chain 15  Rpb10  RNApol I  100
4C2M chain 16  Rpa49  RNApol I  100
4C2M chain 17  Rpc10  RNApol I  100
4C2M chain 18  Rpa43  RNApol I  100
4C2M chain 19  Rpa34  RNApol I  92.4
4C2M chain 20  Rpa135  RNApol I  96.2
4C2M chain 21  Rpa190  RNApol I  88.5
4C2M chain 22  Rpa14  RNApol I  55.1
4C2M chain 23  Rpc40  RNApol I  100
4C2M chain 24  Rpo26  RNApol I  100
4C2M chain 25  Rpb5  RNApol I  100
4C2M chain 26  Rpb8  RNApol I  88.2
4C2M chain 27  Rpa43  RNApol I  80.2
4C2M chain 28  Rpb10  RNApol I  100
4C2M chain 29  Rpa12  RNApol I  96
4C2M chain 30  Rpc19  RNApol I  100
4C3I chain A  Rpa190  RNApol I  89.2
4C3I chain C  Rpc40  RNApol I  99.3
4C3I chain B  Rpa135  RNApol I  98.2
4C3I chain E  Rpb5  RNApol I  100
4C3I chain D  Rpa14  RNApol I  55.1
4C3I chain G  Rpa43  RNApol I  78.3
4C3I chain F  Rpo26  RNApol I  100
4C3I chain I  Rpa12  RNApol I  100
4C3I chain H  Rpb8  RNApol I  84.7
4C3I chain K  Rpc19  RNApol I  100
4C3I chain J  Rpb10  RNApol I  100
4C3I chain M  Rpa49  RNApol I  97.2
4C3I chain L  Rpc10  RNApol I  100
4C3I chain N  Rpa34  RNApol I  88
4V1N chain A  Rpo21  RNApol II  97.9
4V1N chain C  Rpb3  RNApol II  100
4V1N chain B  Rpb2  RNApol II  93.6
4V1N chain E  Rpb5  RNApol II  100
4V1N chain D  Rpb4  RNApol II  80.8
4V1N chain G  Rpb7  RNApol II  100
4V1N chain F  Rpo26  RNApol II  100
4V1N chain I  Rpb9  RNApol II  100
4V1N chain H  Rpb8  RNApol II  91
4V1N chain K  Rpb11  RNApol II  100
4V1N chain J  Rpb10  RNApol II  100
4V1N chain L  Rpc10  RNApol II  100
4V1N chain R  Tfg2  RNApol II  60.3
5FJA chain A  Rpo31  RNApol III  96.2
5FJA chain C  Rpc40  RNApol III  100
5FJA chain B  Ret1  RNApol III  100
5FJA chain E  Rpb5  RNApol III  100
5FJA chain D  Rpc17  RNApol III  73.9
5FJA chain G  Rpc25  RNApol III  85.8
5FJA chain F  Rpo26  RNApol III  100
5FJA chain I  Rpc11  RNApol III  82.7
5FJA chain H  Rpb8  RNApol III  94.5
5FJA chain K  Rpc19  RNApol III  100
5FJA chain J  Rpb10  RNApol III  100
5FJA chain M  Rpc37  RNApol III  84.9
5FJA chain L  Rpc10  RNApol III  100
5FJA chain O  Rpc82  RNApol III  84.3
5FJA chain N  Rpc53  RNApol III  73.8
5FJA chain Q  Rpc31  RNApol III  100
5FJA chain P  Rpc34  RNApol III  57.2


Table S2C. Identity between the proteasome structure and the experimental sequences.

Reference  Yeast protein  Complex  Identity (%)
5CZ4-centered chain A  Pre8  Proteasome  100
5CZ4-centered chain AA  Pre4  Proteasome  100
5CZ4-centered chain B  Pre9  Proteasome  100
5CZ4-centered chain BA  Pre3  Proteasome  100
5CZ4-centered chain C  Pre6  Proteasome  100
5CZ4-centered chain D  Pup2  Proteasome  97.1
5CZ4-centered chain E  Pre5  Proteasome  100
5CZ4-centered chain F  Pre10  Proteasome  100
5CZ4-centered chain G  Scl1  Proteasome  100
5CZ4-centered chain H  Pup1  Proteasome  100
5CZ4-centered chain I  Pup3  Proteasome  100
5CZ4-centered chain J  Pre1  Proteasome  100
5CZ4-centered chain K  Pre2  Proteasome  100
5CZ4-centered chain L  Pre7  Proteasome  100
5CZ4-centered chain M  Pre4  Proteasome  100
5CZ4-centered chain N  Pre3  Proteasome  100
5CZ4-centered chain O  Pre8  Proteasome  100
5CZ4-centered chain P  Pre9  Proteasome  100
5CZ4-centered chain Q  Pre6  Proteasome  100
5CZ4-centered chain R  Pup2  Proteasome  97.1
5CZ4-centered chain S  Pre5  Proteasome  100
5CZ4-centered chain T  Pre10  Proteasome  100
5CZ4-centered chain U  Scl1  Proteasome  100
5CZ4-centered chain V  Pup1  Proteasome  100
5CZ4-centered chain W  Pup3  Proteasome  100
5CZ4-centered chain X  Pre1  Proteasome  100
5CZ4-centered chain Y  Pre2  Proteasome  100
5CZ4-centered chain Z  Pre7  Proteasome  100
5A5B-centered chain A  Pre3  Proteasome  100
5A5B-centered chain AA  Rpn7  Proteasome  100
5A5B-centered chain B  Pup1  Proteasome  100
5A5B-centered chain BA  Rpn3  Proteasome  100
5A5B-centered chain C  Pup3  Proteasome  100
5A5B-centered chain CA  Rpn12  Proteasome  100
5A5B-centered chain D  Pre1  Proteasome  100
5A5B-centered chain DA  Rpn8  Proteasome  82.9
5A5B-centered chain E  Pre2  Proteasome  99.5
5A5B-centered chain EA  Rpn11  Proteasome  89.5
5A5B-centered chain F  Pre7  Proteasome  100
5A5B-centered chain FA  Rpn10  Proteasome  100
5A5B-centered chain G  Pre4  Proteasome  100
5A5B-centered chain GA  Rpn13  Proteasome  100
5A5B-centered chain HA  Sem1  Proteasome  100
5A5B-centered chain IA  Rpn1  Proteasome  85.9
5A5B-centered chain J  Scl1  Proteasome  100
5A5B-centered chain K  Pre8  Proteasome  100
5A5B-centered chain L  Pre9  Proteasome  100
5A5B-centered chain M  Pre6  Proteasome  100
5A5B-centered chain N  Pup2  Proteasome  100
5A5B-centered chain O  Pre5  Proteasome  100
5A5B-centered chain P  Pre10  Proteasome  100
5A5B-centered chain Q  Rpt1  Proteasome  88
5A5B-centered chain R  Rpt2  Proteasome  100
5A5B-centered chain S  Rpt6  Proteasome  100
5A5B-centered chain T  Rpt3  Proteasome  100
5A5B-centered chain U  Rpt4  Proteasome  100
5A5B-centered chain V  Rpt5  Proteasome  93.1
5A5B-centered chain W  Rpn2  Proteasome  90.9
5A5B-centered chain X  Rpn9  Proteasome  100
5A5B-centered chain Y  Rpn5  Proteasome  100
5A5B-centered chain Z  Rpn6  Proteasome  100
Constructed proteasome chain 1  Pup1  Proteasome  100
Constructed proteasome chain 10  Pre8  Proteasome  100
Constructed proteasome chain 11  Pre9  Proteasome  100
Constructed proteasome chain 12  Pre6  Proteasome  100
Constructed proteasome chain 13  Pup2  Proteasome  100
Constructed proteasome chain 14  Pre5  Proteasome  100
Constructed proteasome chain 15  Pre10  Proteasome  100
Constructed proteasome chain 16  Rpt1  Proteasome  88
Constructed proteasome chain 17  Rpt2  Proteasome  100
Constructed proteasome chain 18  Rpt6  Proteasome  100
Constructed proteasome chain 19  Rpt3  Proteasome  100
Constructed proteasome chain 2  Pup3  Proteasome  100
Constructed proteasome chain 20  Rpt4  Proteasome  100
Constructed proteasome chain 21  Rpt5  Proteasome  93.1
Constructed proteasome chain 22  Rpn2  Proteasome  90.9
Constructed proteasome chain 23  Rpn9  Proteasome  100
Constructed proteasome chain 24  Rpn5  Proteasome  100
Constructed proteasome chain 25  Rpn6  Proteasome  100
Constructed proteasome chain 26  Rpn7  Proteasome  100
Constructed proteasome chain 27  Rpn3  Proteasome  100
Constructed proteasome chain 28  Rpn12  Proteasome  100
Constructed proteasome chain 29  Rpn8  Proteasome  82.9
Constructed proteasome chain 3  Pre1  Proteasome  100
Constructed proteasome chain 30  Rpn11  Proteasome  89.5
Constructed proteasome chain 31  Rpn10  Proteasome  100
Constructed proteasome chain 32  Rpn13  Proteasome  100
Constructed proteasome chain 33  Sem1  Proteasome  100
Constructed proteasome chain 34  Rpn1  Proteasome  85.9
Constructed proteasome chain 35  Pup1  Proteasome  100
Constructed proteasome chain 36  Pup3  Proteasome  100
Constructed proteasome chain 37  Pre1  Proteasome  100
Constructed proteasome chain 38  Pre2  Proteasome  100
Constructed proteasome chain 39  Pre7  Proteasome  100
Constructed proteasome chain 4  Pre2  Proteasome  100
Constructed proteasome chain 40  Pre4  Proteasome  100
Constructed proteasome chain 41  Pre3  Proteasome  100
Constructed proteasome chain 42  Pre4  Proteasome  100
Constructed proteasome chain 45  Scl1  Proteasome  100
Constructed proteasome chain 46  Pre8  Proteasome  100
Constructed proteasome chain 47  Pre9  Proteasome  100
Constructed proteasome chain 48  Pre6  Proteasome  100
Constructed proteasome chain 49  Pup2  Proteasome  100
Constructed proteasome chain 5  Pre7  Proteasome  100
Constructed proteasome chain 50  Pre5  Proteasome  100
Constructed proteasome chain 51  Pre10  Proteasome  100
Constructed proteasome chain 52  Rpt1  Proteasome  88
Constructed proteasome chain 53  Rpt2  Proteasome  100
Constructed proteasome chain 54  Rpt6  Proteasome  100
Constructed proteasome chain 55  Rpt3  Proteasome  100
Constructed proteasome chain 56  Rpt4  Proteasome  100
Constructed proteasome chain 57  Rpt5  Proteasome  93.1
Constructed proteasome chain 58  Rpn2  Proteasome  90.9
Constructed proteasome chain 59  Rpn9  Proteasome  100
Constructed proteasome chain 6  Pre3  Proteasome  100
Constructed proteasome chain 60  Rpn5  Proteasome  100
Constructed proteasome chain 61  Rpn6  Proteasome  100
Constructed proteasome chain 62  Rpn7  Proteasome  100
Constructed proteasome chain 63  Rpn3  Proteasome  100
Constructed proteasome chain 64  Rpn12  Proteasome  100
Constructed proteasome chain 65  Rpn8  Proteasome  82.9
Constructed proteasome chain 66  Rpn11  Proteasome  89.5
Constructed proteasome chain 67  Rpn10  Proteasome  100
Constructed proteasome chain 68  Rpn13  Proteasome  100
Constructed proteasome chain 69  Sem1  Proteasome  100
Constructed proteasome chain 70  Rpn1  Proteasome  85.9
Constructed proteasome chain 9  Scl1  Proteasome  100


Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures.

Yeast protein  Complex  Reference  Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signals (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results of the intra-complex experiment. Examples for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates with colony size in standard PCA (left).


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.


Conclusion geacuteneacuterale

The goal of this project was to develop a relatively simple hybrid method. Here, the term "hybrid method" refers to a method that detects associations between proteins that are close to each other in space, without these associations necessarily being physical interactions. Such a method would make it possible to explore and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing linker length changed the interactions detected. It was also relevant to test how the method applies to the study of protein complexes, using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results obtained with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and those whose signal increases.


The rare cases in which the signal decreased as linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when it involves the central proteins. Finally, the associations detected accurately reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA represents a useful advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths. This would provide a validation and a point of comparison, and would help detect problems that may have arisen during construction of the proteins. For example, it makes it easier to spot faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are in very close proximity, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would provide several resolutions of the PPI network. First, the maximum length that still detects plausible protein-protein associations, while limiting false positives, would have to be determined. The optimal increment would also have to be determined, so as to maximize new information while taking into account the additional complexity that each added linker brings. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would give a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas a lower resolution would be needed for large protein complexes.
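The trade-off described above (longer linkers reach more distant partners, at the cost of resolution) can be made concrete with a back-of-the-envelope upper bound. This is a rough sketch and not a calculation from this work: it assumes fully extended linkers at about 3.5 Å per residue, ignores the size of the tagged proteins and of the DHFR fragments, and uses hypothetical linker lengths of 10 and 20 residues. Flexible linkers are rarely fully extended in the cell, so the real reach would likely be shorter.

```python
# Assumed rough length contributed by one residue of a fully extended chain.
ANGSTROM_PER_RESIDUE = 3.5


def max_reach(linker_a_residues, linker_b_residues,
              per_residue=ANGSTROM_PER_RESIDUE):
    """Upper bound (in angstroms) on the gap two linkers could span together.

    Both linkers are treated as fully extended, which is an idealization.
    """
    return (linker_a_residues + linker_b_residues) * per_residue


# Hypothetical linker-length combinations (residues on each fragment).
for a, b in [(10, 10), (20, 10), (20, 20)]:
    print(f"{a} + {b} residues: at most {max_reach(a, b):.0f} angstroms")
```

Under these assumptions, doubling one linker raises the bound by half and doubling both doubles it, which gives a sense of how a graded series of linker lengths could map onto a graded series of distance scales.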

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by linker extension would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.



General introduction

1.1 The fundamental nature of protein-protein interactions

Proteins, through their great diversity of roles, are considered the machinery of life. Their temporary or permanent associations are at the heart of signaling and regulatory pathways as well as protein complexes. Proteins can interact with one another through intermolecular forces such as hydrogen bonds, hydrophobic interactions, van der Waals forces and ionic interactions. Protein-protein interactions (PPIs) are essential for the proper functioning of the cell, since they take part in all cellular processes as well as in the maintenance of cellular functions.

Interactions that form transiently are often found in signaling and regulatory processes. They require excellent spatiotemporal coordination, which explains why diseases such as cancer can arise when this coordination fails (1). One example of a transient association is that between the two catalytic subunits and the two regulatory subunits of protein kinase A (PKA) (2). The activity of this enzyme is regulated by the association and dissociation of the catalytic and regulatory subunits. In yeast and mammals, the transition from one form to the other controls several processes, including energy metabolism, cell growth, aging and the response to stimuli (3-7). In humans, misregulation of the kinase is linked to diseases such as Cushing's syndrome (8).

In addition to transient interactions, the cell is home to stable interactions between proteins, which lead to the formation of protein complexes. Although the PPIs within a complex are stable, the complex itself may form only in a particular context. A protein complex can be defined as an association between two or more proteins (9). The association between these proteins gives rise to additional biological activities that would be impossible for the proteins considered individually. The proteasome illustrates this concept very well: it is a protein complex involved in protein homeostasis through the degradation of obsolete proteins marked with a ubiquitin chain. Its structure, conserved among eukaryotes, consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for fighting cancer and neurodegenerative diseases, for example (14-16).

The two preceding examples clearly show the essential role of protein-protein associations. Nevertheless, they represent only a tiny part of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning and cellular viability of a given organism. The PPI network has been mapped at large scale for several organisms, notably human (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26) and several viruses (27-29). These maps are a static picture of the network and do not fully take into account the cell's ability to adapt to different conditions (e.g., environment, cell cycle). To overcome this limitation, additional maps were subsequently produced that consider the dynamics of interaction networks, by perturbing cell growth conditions. Among other things, they provide information on the adaptation, or plasticity, of an organism in the presence of a stress or a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.

1.2 Practical applications of the study of protein-protein interactions

The study of PPIs brings a new perspective to fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore in yeast and trypanosomes (30). These two organisms, which diverged more than 1.5 billion years ago, show both similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the membrane of the cell nucleus and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues thus identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mRNA export mechanisms in the two organisms (30). In addition, perturbing PPIs can reveal the robustness of a protein complex to mutations, that is, the ability of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analyzing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of some proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).

In medicine, the study of PPIs has been widely used to discover new drugs (32-34). Moreover, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which may eventually make it possible to treat the infections caused by these parasites (35). PPIs also help explain the genetic basis of diseases, as demonstrated by Sahni and collaborators. This team examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of interaction networks was responsible for the diseases under study, affecting the networks either partially or completely. Furthermore, different mutations in the same gene cause different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary because each has advantages and limitations that restrict it to a different subset of the interaction network (37). Nevertheless, the methods as a whole can be divided into two main categories: methods that determine the composition of protein complexes and methods that determine physical interactions between two proteins.

The first category includes methods that purify a protein complex, by either affinity or separation chromatography, and then analyze it by mass spectrometry (MS). The second category comprises a wide variety of methods, including yeast two-hybrid (Y2H), membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the second category share a very similar principle: the reconstitution of a functional reporter that emits a signal when the two proteins interact physically. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, "hybrid method" refers to methods that detect associations between proteins that are close in space without these associations necessarily being physical interactions. These methods therefore have characteristics of both categories. For the purposes of this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary because they define, on the one hand, the components of a protein complex and, on the other hand, the relationships these components maintain with one another.

1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purification of protein complexes followed by identification of their components by MS aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract by means of an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the non-specific interactions that generate background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins found in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed. From a first MS, a peptide is selected and fragmented, and a new spectrum is acquired from the resulting fragments. This additional spectrum provides more information on the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this purification is that it can isolate all the protein complexes of an organism for study (43).
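
As a rough illustration of the mass-to-charge separation described above, the sketch below computes the m/z value of a protonated peptide ion at a few charge states. The peptide mass is a made-up value and the function name is illustrative; the only physical constant used is the proton mass.

```python
PROTON_MASS = 1.007276  # mass of a proton, in daltons


def mass_to_charge(neutral_mass: float, charge: int) -> float:
    """Return m/z for a peptide that carries `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical peptide with a neutral monoisotopic mass of 1500.0 Da.
for z in (1, 2, 3):
    print(f"z = {z}: m/z = {mass_to_charge(1500.0, z):.4f}")
```

The same peptide therefore appears at different positions in the spectrum depending on its charge state, which is one reason spectra are matched against a database rather than read directly.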

1.3.2 Methods determining the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment complementation assay

Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments, each attached to one of the two proteins of interest via a linker. When the two proteins of interest interact physically, the two reporter fragments assemble, reconstituting a functional reporter that produces a detectable signal. In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that led to the development of several others. However, the technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variations of the method have nevertheless been developed to allow the study of membrane proteins (45-47); one such method is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, shows background noise and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. Consequently, it is impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as humans, to be studied in a simpler model (51).

In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription factor is attached. In the presence of a physical interaction between the proteins of interest, the transcription factor attached to the reconstituted ubiquitin is released, activating transcription of a reporter gene. Split-ubiquitin-based methods have enabled major advances in the study of membrane proteins, which are insoluble and located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the inability to quantify results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of using a transcription factor as a reporter, it uses a protein that has been split into two fragments. The choice of the reporter and of the cleavage site were decisive elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to associate spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as a reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved notably in the synthesis of certain DNA bases (purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative and of preserving the native promoter of the proteins studied (48, 55, 56). In addition, PCA results suggest that the cellular localization of proteins is preserved: there is a gene ontology enrichment for several known proteins sharing the same cellular localization (55). However, a change in localization cannot be ruled out, because the reporter fragments are added at the C-terminus, where they could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques arises from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. Adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. It may be necessary to optimize the linker's composition or length. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are mainly useful when a flexible linker fails to separate the two elements adequately or interferes with the activity of the protein. In vivo cleavable linkers release the reporter fragment under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on energy transfer between two fluorescent proteins located close to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are close to each other. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and the partial overlap between the fluorescence of the two proteins can hinder interpretation of the results (60-63).
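
To make the distance dependence of FRET concrete, the sketch below evaluates the standard Förster relation, E = 1 / (1 + (r / R0)^6), which is not given in the text above and is added here only as background. The Förster radius of 50 Å is an assumed, illustrative value; real values depend on the donor and acceptor pair.

```python
def fret_efficiency(distance: float, forster_radius: float) -> float:
    """Energy-transfer efficiency for a donor-acceptor pair.

    Both arguments must be in the same length unit (for example angstroms).
    """
    if distance < 0 or forster_radius <= 0:
        raise ValueError("distance must be >= 0 and forster_radius > 0")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


R0 = 50.0  # assumed Förster radius in angstroms (illustrative only)
for r in (25.0, 50.0, 75.0, 100.0):
    print(f"r = {r:5.1f} Å -> E = {fret_efficiency(r, R0):.3f}")
```

Because efficiency falls off with the sixth power of the distance, the signal drops sharply once the two fluorescent proteins are more than about one Förster radius apart.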

Cross-linking followed by MS is practically identical to the purification and MS techniques, except that, before purification, the proteins are attached to one another by covalent bonds. These bonds withstand enzymatic digestion and thus provide structural information on how the proteins associate within the protein complex. However, cross-linking complicates data analysis and can potentially lead to a mistaken view of the architecture of the protein complex. The method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity that is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to interactions between neighboring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can interfere with the activity of the protein of interest or with the formation of interactions. In addition, the method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They report on the proximity of proteins, giving access to a molecular scale of resolution that is otherwise difficult to reach. However, in addition to being complex, the existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would help better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. They would also compensate for the shortcomings of high-molecular-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to many protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good association of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small amino acids, the first nonpolar and the second polar. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also constrains the ability to detect an interaction, as observed notably by the research team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. Furthermore, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, by the same token, to obtain new structural information.
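
The relation between linker length and detectable distance can be sketched with simple arithmetic. The example below takes the two figures given above (a linker of two GGGGS repeats, and 82 Å for two such linkers end to end) and assumes, as a simplification, that the reachable span scales linearly with the number of residues. Repeat counts other than two are example inputs only, and a flexible linker in a cell is unlikely to be fully extended, so these numbers are upper-bound estimates rather than measured values.

```python
MOTIF_LENGTH = 5         # residues in one GGGGS repeat
REFERENCE_REPEATS = 2    # repeats in the standard linker
REFERENCE_SPAN = 82.0    # angstroms, two standard linkers end to end (55)

# Per-residue length implied by the reference figures (82 Å over 20 residues).
ANGSTROM_PER_RESIDUE = REFERENCE_SPAN / (2 * REFERENCE_REPEATS * MOTIF_LENGTH)


def max_span(repeats_a: int, repeats_b: int) -> float:
    """Estimated end-to-end span (Å) of two linkers with the given repeat counts.

    Assumes linear scaling with residue number, which is a simplification.
    """
    residues = (repeats_a + repeats_b) * MOTIF_LENGTH
    return residues * ANGSTROM_PER_RESIDUE


for a, b in [(2, 2), (2, 4), (4, 4)]:
    print(f"{a}x + {b}x repeats: about {max_span(a, b):.0f} Å")
```

Under this assumption, each additional repeat on one of the two fusions extends the estimated reach by roughly 20 Å.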

1.6 Research objectives

These earlier results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length in DHFR PCA would allow the detection of interactions increasingly distant in space, thereby modulating the scale of molecular resolution. This adaptation would yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The final objective was to determine whether increasing linker length allowed the detection of associations between proteins farther apart in space. To do so, distances were calculated between the proteins contained in the proteasome structures and compared with the experimental results.
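
The kind of distance calculation mentioned in the final objective can be sketched as follows. This is a minimal illustration, not the analysis used in this work: the protein names and coordinates are invented, and the 82 Å cutoff is simply the figure quoted earlier from reference (55). In practice the coordinates would come from a solved structure.

```python
import math
from itertools import combinations

# Hypothetical coordinates (in angstroms) of the C-terminal atom of each protein.
c_termini = {
    "ProtA": (0.0, 0.0, 0.0),
    "ProtB": (30.0, 40.0, 0.0),
    "ProtC": (60.0, 80.0, 120.0),
}


def pairs_within(coords, cutoff):
    """Return (name1, name2, distance) for every pair closer than `cutoff`."""
    hits = []
    for (name1, xyz1), (name2, xyz2) in combinations(sorted(coords.items()), 2):
        d = math.dist(xyz1, xyz2)  # Euclidean distance, Python 3.8+
        if d < cutoff:
            hits.append((name1, name2, d))
    return hits


for name1, name2, d in pairs_within(c_termini, 82.0):
    print(f"{name1}-{name2}: {d:.1f} Å")
```

Pairs returned by such a filter are the ones a given linker combination would be expected to reach, and they can then be compared with the pairs that actually give a signal.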

This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive in several respects, notably the availability of numerous powerful genetic tools, its rapid cell division and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).


Measuring proximate protein association in living cells using Protein-fragment complementation assay (PCA)

Résumé

Understanding how the cellular system functions requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the dihydrofolate reductase-based protein-fragment complementation assay (DHFR PCA) in yeast to obtain low-resolution structural constraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex membership or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, assignment of a protein to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods that detect co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments by a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint imposed by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
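
For readers preparing such media, the arithmetic behind a percentage recipe can be sketched as below. This assumes the percentages are weight per volume (1% = 1 g per 100 mL); the 500 mL batch size and the 100 mg/mL antibiotic stock concentration are hypothetical values chosen for the example, not values from this study.

```python
def grams_needed(percent_w_v: float, volume_ml: float) -> float:
    """Mass of a solid (g) for a given % (w/v) in `volume_ml` of medium."""
    return percent_w_v * volume_ml / 100.0


def stock_volume_ml(final_ug_per_ml: float, volume_ml: float,
                    stock_mg_per_ml: float) -> float:
    """Volume of stock solution (mL) needed to reach the final concentration."""
    return final_ug_per_ml * volume_ml / (stock_mg_per_ml * 1000.0)


# Solid YPD components as listed above, in % (w/v).
ypd_solid = {"yeast extract": 1.0, "tryptone": 2.0, "glucose": 2.0, "agar": 2.0}

batch_ml = 500.0  # hypothetical batch volume
for component, percent in ypd_solid.items():
    print(f"{component}: {grams_needed(percent, batch_ml):.1f} g")

# 100 µg/mL final concentration from a hypothetical 100 mg/mL stock.
print(f"antibiotic stock: {stock_volume_ml(100.0, batch_ml, 100.0):.2f} mL")
```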


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to linkers of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
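
The idea of synonymous codons encoding the same linker peptide can be illustrated with a small translation check. The two DNA sequences below are invented for the example and are not the sequences present in the plasmids; the codon table is restricted to the standard glycine and serine codons, which is all a GGGGS repeat needs.

```python
# Standard genetic code, restricted to glycine (G) and serine (S) codons.
CODON_TABLE = {
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
}


def translate(dna: str) -> str:
    """Translate a Gly/Ser-only coding sequence; other codons raise KeyError."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two hypothetical repeats that differ at the DNA level only.
repeat_1 = "GGTGGTGGTGGTTCT"
repeat_2 = "GGCGGAGGGGGTAGC"

print(translate(repeat_1), translate(repeat_2))
print(translate(repeat_1 + repeat_2))
```

Both repeats translate to the same GGGGS motif even though their nucleotide sequences differ at several positions.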

To replace the 2xL of pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted into plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. Plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed into E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.
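
An insert:vector ratio such as the one above is a molar ratio, so the mass of insert depends on the relative lengths of the two DNA molecules. The sketch below shows the usual conversion, assuming an average mass of about 650 g/mol per base pair of double-stranded DNA. The fragment lengths and the 100 ng of vector are hypothetical values, not those used in this study.

```python
AVG_BP_MASS = 650.0  # approximate g/mol per base pair of double-stranded DNA


def ng_to_pmol(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA (ng) to picomoles."""
    return mass_ng * 1000.0 / (length_bp * AVG_BP_MASS)


def insert_ng_for_ratio(vector_ng: float, vector_bp: int,
                        insert_bp: int, molar_ratio: float) -> float:
    """Mass of insert (ng) giving `molar_ratio` insert molecules per vector."""
    return molar_ratio * vector_ng * insert_bp / vector_bp


# Hypothetical example: 100 ng of a 5000 bp vector, 500 bp insert, 5:1 ratio.
insert_ng = insert_ng_for_ratio(100.0, 5000, 500, 5.0)
print(f"insert needed: {insert_ng:.1f} ng")
print(f"vector: {ng_to_pmol(100.0, 5000):.4f} pmol, "
      f"insert: {ng_to_pmol(insert_ng, 500):.4f} pmol")
```

For a three-part assembly, the same function can be applied once per insert, each relative to the vector.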

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR-amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR products were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[12] and DHFR F[3] fusions, respectively (Table S1A). All fusions were made at the 3′ end of genes. The 2x/3x/4xL-DHFR F[12]/F[3] fragments, along with the NAT (for DHFR F[12]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused to the DHFR fragments (PCR primer sequences are listed in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[12]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing of all strains confirmed proper DHFR fragment fusions.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains with proteins fused to the 2xL and 4xL. These proteins were selected because their abundance could easily be assessed using antibodies directed against them. 20 OD600 of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads (425-600 µm, Sigma; 0.1 g) were added and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p, anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10,000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10,000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[12] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them according to data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published by (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[12] and/or 2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 of the 95 proteins initially selected were tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, within a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[12]-Arc35-2xL-F[3]) to avoid border effects on colony growth.
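The block layout described above can be illustrated with a minimal Python sketch. It only enumerates the four linker combinations of a block and shuffles replicated blocks; the function names, the seed and the reading of each label as "bait linker-prey linker" are assumptions made for illustration, not the procedure actually used to design the arrays.

```python
import itertools
import random

LINKERS = ("2xL", "4xL")


def linker_block():
    """Return the four linker combinations tested within one block.

    Each label is read here as "<first partner linker>-<second partner linker>"
    (an assumption; the text does not state which partner comes first).
    """
    return [f"{a}-{b}" for a, b in itertools.product(LINKERS, repeat=2)]


def randomized_blocks(pairs, n_replicates, seed=0):
    """Replicate one block per protein pair and shuffle the block order.

    The seeded shuffle is only a stand-in for the random positioning of
    blocks on the colony array.
    """
    blocks = [(pair, linker_block()) for pair in pairs for _ in range(n_replicates)]
    random.Random(seed).shuffle(blocks)
    return blocks
```

For example, `randomized_blocks([("Rpb5", "Rpb10")], 14)` yields 14 blocks of four colonies, i.e. 56 array positions for that pair.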

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed into 384-format arrays and finally into 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey plates were generated and replicated several times to have enough cells to perform crosses with all of the individual baits. Each 1536-bait plate was then crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB, with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the difference in signal was increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments is shown in Fig S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function "Analyze particle" in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, and used the particle closest to the center. We considered the particle to be a colony if its mass center was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2 transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
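As an illustration of this transformation and filter, here is a minimal Python sketch. The record layout (dictionaries with `mtx_intensity` and `diploid_size` keys) and the function names are hypothetical; only the log2(x + 1) transform and the size cutoff come from the text.

```python
import math


def log2_intensity(raw_intensity):
    """Return log2(raw + 1); adding 1 keeps empty positions (raw = 0) finite."""
    return math.log2(raw_intensity + 1)


def filter_and_transform(colonies, min_diploid_size=16):
    """Drop colonies that were too small on the diploid selection plate,
    then attach the log2-transformed MTX intensity to each remaining colony.

    `colonies` is a list of dicts with hypothetical keys
    'mtx_intensity' and 'diploid_size'.
    """
    return [
        {**colony, "log2_intensity": log2_intensity(colony["mtx_intensity"])}
        for colony in colonies
        if colony["diploid_size"] >= min_diploid_size
    ]
```

Filtering on the diploid plate rather than on the MTX plate separates positions where the cross failed from positions where diploids grew but the reporter did not complement.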

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction across linker size combinations. We considered changes significant when Zs differed by more than 2.
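The scoring scheme above can be sketched in Python. The analysis itself relied on the R package mixtools; the hand-written expectation-maximization below is only a stand-in to show the logic. The initialization (splitting the sorted scores in two halves), the number of iterations and all function names are assumptions, and the lower-mean component is taken to be the background.

```python
import math
import statistics


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(scores, n_iter=500):
    """Fit a two-component 1D normal mixture by EM.

    Returns [(weight, mean, sd), (weight, mean, sd)] sorted by mean.
    """
    xs = sorted(scores)
    half = len(xs) // 2
    # Assumed initialization: one component per half of the sorted scores.
    params = [
        (0.5, statistics.mean(part), statistics.pstdev(part) or 1.0)
        for part in (xs[:half], xs[half:])
    ]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in scores:
            dens = [w * normal_pdf(x, m, s) for w, m, s in params]
            total = sum(dens)
            resp.append([d / total for d in dens] if total > 0 else [0.5, 0.5])
        # M-step: update weight, mean and standard deviation.
        new_params = []
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            mean = sum(r[k] * x for r, x in zip(resp, scores)) / nk
            var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, scores)) / nk
            new_params.append((nk / len(scores), mean, max(math.sqrt(var), 1e-6)))
        params = new_params
    return sorted(params, key=lambda p: p[1])


def background_parameters(scores):
    """Return (b, sdb): mean and sd of the lower (background) component."""
    _, b, sdb = fit_two_normals(scores)[0]
    return b, sdb


def z_scores(scores):
    """Convert interaction scores to z-scores, Zs = (Is - b) / sdb."""
    b, sdb = background_parameters(scores)
    return [(s - b) / sdb for s in scores]
```

An interaction would then be called detected when its z-score exceeds 2.5, as stated above.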

For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median colony size was used as the Is. Significant interactions were identified as described above (Fig S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) gave inconsistent results for all tested interactions, leaving a total of 228 (197 unique) pairs of interacting proteins.
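The outlier rule and the replicate threshold can be sketched as follows in Python. The quartile convention (the "inclusive" method of the standard library) is an assumption, since the text does not say how quartiles were computed, and the function names are hypothetical.

```python
import statistics


def remove_extreme_outliers(values, k=3.0):
    """Keep values within [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    # Assumed quartile convention; other conventions shift the bounds slightly.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def interaction_score(values, min_replicates=4):
    """Median colony size after outlier removal, or None if too few replicates remain."""
    kept = remove_extreme_outliers(values)
    if len(kept) < min_replicates:
        return None
    return statistics.median(kept)
```

With k = 3 only very distant colonies are discarded, which is stricter than the more common 1.5 x IQR rule and leaves most replicate-to-replicate variation in the data.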

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, within a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05); and quantitative changes, where the interaction was detected with the reference linker size (2xL-2xL) and presented a significant change for at least one longer linker combination (difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
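The three categories can be expressed as a small decision function. The Python sketch below assumes the Benjamini-Hochberg procedure for the FDR correction (the text only says "FDR corrected") and takes the paired t-test p-values as given. Treating a significant difference of magnitude 1 or less as "unchanged" is also an assumption, since the text does not address that case.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(detected_ref, detected_longer, diff, fdr_p,
                  alpha=0.05, min_diff=1.0):
    """Classify one protein pair for one longer-linker combination.

    detected_ref: detected with the 2xL-2xL reference.
    detected_longer: detected with the longer linker combination.
    diff: Is(longer combination) - Is(2xL-2xL).
    fdr_p: FDR-corrected p-value of the paired t-test.
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    if fdr_p < alpha and abs(diff) > min_diff:
        return "quantitative change"
    return "unchanged"
```

A pair would be reported as a quantitative change if at least one of its longer-linker combinations is classified that way.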

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is incomplete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using Dijkstra's algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the two nodes. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered surface residues (see Fig S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
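The surface-path distance can be illustrated with a short Python sketch. The analysis used the Dijkstra implementation of NetworkX; the version below re-implements the same algorithm with the standard library only, so that it runs without dependencies. The node labels, coordinates and function names are hypothetical.

```python
import heapq
import math


def build_surface_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`, weighting each edge
    by the Euclidean distance between the two nodes.

    coords: dict mapping a node label (e.g. a surface residue) to (x, y, z).
    """
    nodes = list(coords)
    graph = {node: [] for node in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when no path exists."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf
```

Because edges only join surface nodes, the resulting path follows the outside of the complex instead of cutting through it, which is closer to the route a flexible linker would have to take than a straight-line distance.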

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[12] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig S1A), suggesting that if linker length affects protein stability, the effect is minor and not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, each with the 2xL, 3xL and 4xL fused to the DHFR F[12] fragment. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins within the same complexes as the baits and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered true PPIs (Fig S1B and Table S1C), representing PPIs among 228 protein pairs after filtering (197 unique, with reciprocal interactions such as X-DHFR F[12]-Y-DHFR F[3] and Y-DHFR F[12]-X-DHFR F[3] counted as a single PPI).

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[12]-Y-DHFR F[3] versus Y-DHFR F[12]-X-DHFR F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228 or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B). PCA signal was quantitatively changed for 19 (18 unique) interacting pairs and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[12]) was sufficient to detect new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often further improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from destabilization of the protein (Fig 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from the lobe a or lobe b (Fig 1D, right panel). Together, these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, the PCA signal obtained with longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and right panels). Structural biology in living cells could thus benefit from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig S1D). Finally, we find that the distance between proteins is significantly longer for cases where longer linker size increases signal or leads to the detection of new interactions (Fig 2C). This demonstrates once again that longer linkers enhance the ability to detect interactions, especially for proteins that are more distant in space.
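The distance-signal relationship is summarized with a Spearman rank correlation (see the legend of Fig 2B). As an illustration of what that statistic measures, here is a minimal standard-library Python sketch. The function names are hypothetical and the numbers in the usage note are made up, not taken from Table S2A.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

For instance, `spearman([10, 20, 30, 40, 50], [9.0, 7.5, 8.0, 3.0, 1.0])` is negative because the z-scores tend to decrease as the distance increases, which is the pattern reported for the proteasome.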

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful for inferring the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[12] fragment with a 3xL (left) and a 4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR F[12]-Y-DHFR F[3] and Y-DHFR F[12]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request

Table S1B PCA data for global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request

Table S1C PCA data for intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request

Table S1D PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request


Table S2A Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request


Table S2B. Identity between each RNApol structure and the experimental sequences.

| Reference | Yeast protein | Complex | Identity (%) |
|---|---|---|---|
| 4C2M chain 1 | Rpc10 | RNApol I | 100 |
| 4C2M chain 2 | Rpa34 | RNApol I | 92.4 |
| 4C2M chain 3 | Rpa49 | RNApol I | 94.4 |
| 4C2M chain 4 | Rpa43 | RNApol I | 100 |
| 4C2M chain 5 | Rpa190 | RNApol I | 89.7 |
| 4C2M chain 6 | Rpc40 | RNApol I | 100 |
| 4C2M chain 7 | Rpa135 | RNApol I | 97.2 |
| 4C2M chain 8 | Rpb5 | RNApol I | 100 |
| 4C2M chain 9 | Rpa14 | RNApol I | 59.6 |
| 4C2M chain 10 | Rpa43 | RNApol I | 81.4 |
| 4C2M chain 11 | Rpo26 | RNApol I | 100 |
| 4C2M chain 12 | Rpa12 | RNApol I | 100 |
| 4C2M chain 13 | Rpb8 | RNApol I | 88.2 |
| 4C2M chain 14 | Rpc19 | RNApol I | 100 |
| 4C2M chain 15 | Rpb10 | RNApol I | 100 |
| 4C2M chain 16 | Rpa49 | RNApol I | 100 |
| 4C2M chain 17 | Rpc10 | RNApol I | 100 |
| 4C2M chain 18 | Rpa43 | RNApol I | 100 |
| 4C2M chain 19 | Rpa34 | RNApol I | 92.4 |
| 4C2M chain 20 | Rpa135 | RNApol I | 96.2 |
| 4C2M chain 21 | Rpa190 | RNApol I | 88.5 |
| 4C2M chain 22 | Rpa14 | RNApol I | 55.1 |
| 4C2M chain 23 | Rpc40 | RNApol I | 100 |
| 4C2M chain 24 | Rpo26 | RNApol I | 100 |
| 4C2M chain 25 | Rpb5 | RNApol I | 100 |
| 4C2M chain 26 | Rpb8 | RNApol I | 88.2 |
| 4C2M chain 27 | Rpa43 | RNApol I | 80.2 |
| 4C2M chain 28 | Rpb10 | RNApol I | 100 |
| 4C2M chain 29 | Rpa12 | RNApol I | 96 |
| 4C2M chain 30 | Rpc19 | RNApol I | 100 |
| 4C3I chain A | Rpa190 | RNApol I | 89.2 |
| 4C3I chain C | Rpc40 | RNApol I | 99.3 |
| 4C3I chain B | Rpa135 | RNApol I | 98.2 |
| 4C3I chain E | Rpb5 | RNApol I | 100 |
| 4C3I chain D | Rpa14 | RNApol I | 55.1 |
| 4C3I chain G | Rpa43 | RNApol I | 78.3 |
| 4C3I chain F | Rpo26 | RNApol I | 100 |
| 4C3I chain I | Rpa12 | RNApol I | 100 |
| 4C3I chain H | Rpb8 | RNApol I | 84.7 |
| 4C3I chain K | Rpc19 | RNApol I | 100 |
| 4C3I chain J | Rpb10 | RNApol I | 100 |
| 4C3I chain M | Rpa49 | RNApol I | 97.2 |
| 4C3I chain L | Rpc10 | RNApol I | 100 |
| 4C3I chain N | Rpa34 | RNApol I | 88 |
| 4V1N chain A | Rpo21 | RNApol II | 97.9 |
| 4V1N chain C | Rpb3 | RNApol II | 100 |
| 4V1N chain B | Rpb2 | RNApol II | 93.6 |
| 4V1N chain E | Rpb5 | RNApol II | 100 |
| 4V1N chain D | Rpb4 | RNApol II | 80.8 |
| 4V1N chain G | Rpb7 | RNApol II | 100 |
| 4V1N chain F | Rpo26 | RNApol II | 100 |
| 4V1N chain I | Rpb9 | RNApol II | 100 |
| 4V1N chain H | Rpb8 | RNApol II | 91 |
| 4V1N chain K | Rpb11 | RNApol II | 100 |
| 4V1N chain J | Rpb10 | RNApol II | 100 |
| 4V1N chain L | Rpc10 | RNApol II | 100 |
| 4V1N chain R | Tfg2 | RNApol II | 60.3 |
| 5FJA chain A | Rpo31 | RNApol III | 96.2 |
| 5FJA chain C | Rpc40 | RNApol III | 100 |
| 5FJA chain B | Ret1 | RNApol III | 100 |
| 5FJA chain E | Rpb5 | RNApol III | 100 |
| 5FJA chain D | Rpc17 | RNApol III | 73.9 |
| 5FJA chain G | Rpc25 | RNApol III | 85.8 |
| 5FJA chain F | Rpo26 | RNApol III | 100 |
| 5FJA chain I | Rpc11 | RNApol III | 82.7 |
| 5FJA chain H | Rpb8 | RNApol III | 94.5 |
| 5FJA chain K | Rpc19 | RNApol III | 100 |
| 5FJA chain J | Rpb10 | RNApol III | 100 |
| 5FJA chain M | Rpc37 | RNApol III | 84.9 |
| 5FJA chain L | Rpc10 | RNApol III | 100 |
| 5FJA chain O | Rpc82 | RNApol III | 84.3 |
| 5FJA chain N | Rpc53 | RNApol III | 73.8 |
| 5FJA chain Q | Rpc31 | RNApol III | 100 |
| 5FJA chain P | Rpc34 | RNApol III | 57.2 |
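As an illustration of how a percent identity between a structure chain and an experimental sequence can be obtained, the sketch below counts matching residues over the aligned, non-gap columns of a pairwise alignment. It is only a minimal example under stated assumptions: the text does not say which alignment tool or which denominator was used for the values in the table, and the function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_ref, aligned_query, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both strings come from the same pairwise alignment and
    therefore have the same length. The choice of denominator (non-gap
    columns) is one common convention, not necessarily the one used here.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    matches = 0
    for ref_res, query_res in zip(aligned_ref, aligned_query):
        # Skip columns where either sequence has a gap.
        if ref_res == gap or query_res == gap:
            continue
        compared += 1
        if ref_res == query_res:
            matches += 1

    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return round(100.0 * matches / compared, 1)


# Toy alignment (hypothetical sequences): 6 matches over 7 compared columns.
print(percent_identity("MKT-AYIA", "MKTQAYLA"))
```

Using the length of the experimental sequence as the denominator instead would penalize unresolved regions of the structure, so the convention should be stated alongside any reported value.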


Table S2C. Identity between proteasome structure and the experimental sequence.

| Reference | Yeast protein | Complex | Identity (%) |
|---|---|---|---|
| 5CZ4-centered chain A | Pre8 | Proteasome | 100 |
| 5CZ4-centered chain AA | Pre4 | Proteasome | 100 |
| 5CZ4-centered chain B | Pre9 | Proteasome | 100 |
| 5CZ4-centered chain BA | Pre3 | Proteasome | 100 |
| 5CZ4-centered chain C | Pre6 | Proteasome | 100 |
| 5CZ4-centered chain D | Pup2 | Proteasome | 97.1 |
| 5CZ4-centered chain E | Pre5 | Proteasome | 100 |
| 5CZ4-centered chain F | Pre10 | Proteasome | 100 |
| 5CZ4-centered chain G | Scl1 | Proteasome | 100 |
| 5CZ4-centered chain H | Pup1 | Proteasome | 100 |
| 5CZ4-centered chain I | Pup3 | Proteasome | 100 |
| 5CZ4-centered chain J | Pre1 | Proteasome | 100 |
| 5CZ4-centered chain K | Pre2 | Proteasome | 100 |
| 5CZ4-centered chain L | Pre7 | Proteasome | 100 |
| 5CZ4-centered chain M | Pre4 | Proteasome | 100 |
| 5CZ4-centered chain N | Pre3 | Proteasome | 100 |
| 5CZ4-centered chain O | Pre8 | Proteasome | 100 |
| 5CZ4-centered chain P | Pre9 | Proteasome | 100 |
| 5CZ4-centered chain Q | Pre6 | Proteasome | 100 |
| 5CZ4-centered chain R | Pup2 | Proteasome | 97.1 |
| 5CZ4-centered chain S | Pre5 | Proteasome | 100 |
| 5CZ4-centered chain T | Pre10 | Proteasome | 100 |
| 5CZ4-centered chain U | Scl1 | Proteasome | 100 |
| 5CZ4-centered chain V | Pup1 | Proteasome | 100 |
| 5CZ4-centered chain W | Pup3 | Proteasome | 100 |
| 5CZ4-centered chain X | Pre1 | Proteasome | 100 |
| 5CZ4-centered chain Y | Pre2 | Proteasome | 100 |
| 5CZ4-centered chain Z | Pre7 | Proteasome | 100 |
| 5A5B-centered chain A | Pre3 | Proteasome | 100 |
| 5A5B-centered chain AA | Rpn7 | Proteasome | 100 |
| 5A5B-centered chain B | Pup1 | Proteasome | 100 |
| 5A5B-centered chain BA | Rpn3 | Proteasome | 100 |
| 5A5B-centered chain C | Pup3 | Proteasome | 100 |
| 5A5B-centered chain CA | Rpn12 | Proteasome | 100 |
| 5A5B-centered chain D | Pre1 | Proteasome | 100 |
| 5A5B-centered chain DA | Rpn8 | Proteasome | 82.9 |
| 5A5B-centered chain E | Pre2 | Proteasome | 99.5 |
| 5A5B-centered chain EA | Rpn11 | Proteasome | 89.5 |
| 5A5B-centered chain F | Pre7 | Proteasome | 100 |
| 5A5B-centered chain FA | Rpn10 | Proteasome | 100 |
| 5A5B-centered chain G | Pre4 | Proteasome | 100 |
| 5A5B-centered chain GA | Rpn13 | Proteasome | 100 |
| 5A5B-centered chain HA | Sem1 | Proteasome | 100 |
| 5A5B-centered chain IA | Rpn1 | Proteasome | 85.9 |
| 5A5B-centered chain J | Scl1 | Proteasome | 100 |
| 5A5B-centered chain K | Pre8 | Proteasome | 100 |
| 5A5B-centered chain L | Pre9 | Proteasome | 100 |
| 5A5B-centered chain M | Pre6 | Proteasome | 100 |
| 5A5B-centered chain N | Pup2 | Proteasome | 100 |
| 5A5B-centered chain O | Pre5 | Proteasome | 100 |
| 5A5B-centered chain P | Pre10 | Proteasome | 100 |
| 5A5B-centered chain Q | Rpt1 | Proteasome | 88 |
| 5A5B-centered chain R | Rpt2 | Proteasome | 100 |
| 5A5B-centered chain S | Rpt6 | Proteasome | 100 |
| 5A5B-centered chain T | Rpt3 | Proteasome | 100 |
| 5A5B-centered chain U | Rpt4 | Proteasome | 100 |
| 5A5B-centered chain V | Rpt5 | Proteasome | 93.1 |
| 5A5B-centered chain W | Rpn2 | Proteasome | 90.9 |
| 5A5B-centered chain X | Rpn9 | Proteasome | 100 |
| 5A5B-centered chain Y | Rpn5 | Proteasome | 100 |
| 5A5B-centered chain Z | Rpn6 | Proteasome | 100 |
| Constructed proteasome chain 1 | Pup1 | Proteasome | 100 |
| Constructed proteasome chain 10 | Pre8 | Proteasome | 100 |
| Constructed proteasome chain 11 | Pre9 | Proteasome | 100 |
| Constructed proteasome chain 12 | Pre6 | Proteasome | 100 |
| Constructed proteasome chain 13 | Pup2 | Proteasome | 100 |
| Constructed proteasome chain 14 | Pre5 | Proteasome | 100 |
| Constructed proteasome chain 15 | Pre10 | Proteasome | 100 |
| Constructed proteasome chain 16 | Rpt1 | Proteasome | 88 |
| Constructed proteasome chain 17 | Rpt2 | Proteasome | 100 |
| Constructed proteasome chain 18 | Rpt6 | Proteasome | 100 |
| Constructed proteasome chain 19 | Rpt3 | Proteasome | 100 |
| Constructed proteasome chain 2 | Pup3 | Proteasome | 100 |
| Constructed proteasome chain 20 | Rpt4 | Proteasome | 100 |
| Constructed proteasome chain 21 | Rpt5 | Proteasome | 93.1 |
| Constructed proteasome chain 22 | Rpn2 | Proteasome | 90.9 |
| Constructed proteasome chain 23 | Rpn9 | Proteasome | 100 |
| Constructed proteasome chain 24 | Rpn5 | Proteasome | 100 |
| Constructed proteasome chain 25 | Rpn6 | Proteasome | 100 |
| Constructed proteasome chain 26 | Rpn7 | Proteasome | 100 |
| Constructed proteasome chain 27 | Rpn3 | Proteasome | 100 |
| Constructed proteasome chain 28 | Rpn12 | Proteasome | 100 |
| Constructed proteasome chain 29 | Rpn8 | Proteasome | 82.9 |
| Constructed proteasome chain 3 | Pre1 | Proteasome | 100 |
| Constructed proteasome chain 30 | Rpn11 | Proteasome | 89.5 |
| Constructed proteasome chain 31 | Rpn10 | Proteasome | 100 |
| Constructed proteasome chain 32 | Rpn13 | Proteasome | 100 |
| Constructed proteasome chain 33 | Sem1 | Proteasome | 100 |
| Constructed proteasome chain 34 | Rpn1 | Proteasome | 85.9 |
| Constructed proteasome chain 35 | Pup1 | Proteasome | 100 |
| Constructed proteasome chain 36 | Pup3 | Proteasome | 100 |
| Constructed proteasome chain 37 | Pre1 | Proteasome | 100 |
| Constructed proteasome chain 38 | Pre2 | Proteasome | 100 |
| Constructed proteasome chain 39 | Pre7 | Proteasome | 100 |
| Constructed proteasome chain 4 | Pre2 | Proteasome | 100 |
| Constructed proteasome chain 40 | Pre4 | Proteasome | 100 |
| Constructed proteasome chain 41 | Pre3 | Proteasome | 100 |
| Constructed proteasome chain 42 | Pre4 | Proteasome | 100 |
| Constructed proteasome chain 45 | Scl1 | Proteasome | 100 |
| Constructed proteasome chain 46 | Pre8 | Proteasome | 100 |
| Constructed proteasome chain 47 | Pre9 | Proteasome | 100 |
| Constructed proteasome chain 48 | Pre6 | Proteasome | 100 |
| Constructed proteasome chain 49 | Pup2 | Proteasome | 100 |
| Constructed proteasome chain 5 | Pre7 | Proteasome | 100 |
| Constructed proteasome chain 50 | Pre5 | Proteasome | 100 |
| Constructed proteasome chain 51 | Pre10 | Proteasome | 100 |
| Constructed proteasome chain 52 | Rpt1 | Proteasome | 88 |
| Constructed proteasome chain 53 | Rpt2 | Proteasome | 100 |
| Constructed proteasome chain 54 | Rpt6 | Proteasome | 100 |
| Constructed proteasome chain 55 | Rpt3 | Proteasome | 100 |
| Constructed proteasome chain 56 | Rpt4 | Proteasome | 100 |
| Constructed proteasome chain 57 | Rpt5 | Proteasome | 93.1 |
| Constructed proteasome chain 58 | Rpn2 | Proteasome | 90.9 |
| Constructed proteasome chain 59 | Rpn9 | Proteasome | 100 |
| Constructed proteasome chain 6 | Pre3 | Proteasome | 100 |
| Constructed proteasome chain 60 | Rpn5 | Proteasome | 100 |
| Constructed proteasome chain 61 | Rpn6 | Proteasome | 100 |
| Constructed proteasome chain 62 | Rpn7 | Proteasome | 100 |
| Constructed proteasome chain 63 | Rpn3 | Proteasome | 100 |
| Constructed proteasome chain 64 | Rpn12 | Proteasome | 100 |
| Constructed proteasome chain 65 | Rpn8 | Proteasome | 82.9 |
| Constructed proteasome chain 66 | Rpn11 | Proteasome | 89.5 |
| Constructed proteasome chain 67 | Rpn10 | Proteasome | 100 |
| Constructed proteasome chain 68 | Rpn13 | Proteasome | 100 |
| Constructed proteasome chain 69 | Sem1 | Proteasome | 100 |
| Constructed proteasome chain 70 | Rpn1 | Proteasome | 85.9 |
| Constructed proteasome chain 9 | Scl1 | Proteasome | 100 |


Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures.

| Yeast protein | Complex | Reference | Number of missing residues in C-ter |
|---|---|---|---|
| Rpa190 | RNApol I | 4C2M monomer 1 | 0 |
| Rpa14 | RNApol I | 4C2M monomer 1 | 37 |
| Rpa12 | RNApol I | 4C2M monomer 1 | 0 |
| Rpb5 | RNApol I | 4C2M monomer 1 | 0 |
| Rpb10 | RNApol I | 4C2M monomer 1 | 1 |
| Rpa49 | RNApol I | 4C2M monomer 1 | 300 |
| Rpc19 | RNApol I | 4C2M monomer 1 | 0 |
| Rpb8 | RNApol I | 4C2M monomer 1 | 0 |
| Rpa34 | RNApol I | 4C2M monomer 1 | 52 |
| Rpa43 | RNApol I | 4C2M monomer 1 | 10 |
| Rpc40 | RNApol I | 4C2M monomer 1 | 0 |
| Rpc10 | RNApol I | 4C2M monomer 1 | 0 |
| Rpa135 | RNApol I | 4C2M monomer 1 | 0 |
| Rpo26 | RNApol I | 4C2M monomer 1 | 1 |
| Rpa190 | RNApol I | 4C2M monomer 2 | 0 |
| Rpa14 | RNApol I | 4C2M monomer 2 | 37 |
| Rpa12 | RNApol I | 4C2M monomer 2 | 0 |
| Rpb5 | RNApol I | 4C2M monomer 2 | 0 |
| Rpb10 | RNApol I | 4C2M monomer 2 | 1 |
| Rpa49 | RNApol I | 4C2M monomer 2 | 300 |
| Rpc19 | RNApol I | 4C2M monomer 2 | 0 |
| Rpb8 | RNApol I | 4C2M monomer 2 | 0 |
| Rpa34 | RNApol I | 4C2M monomer 2 | 53 |
| Rpa43 | RNApol I | 4C2M monomer 2 | 76 |
| Rpc40 | RNApol I | 4C2M monomer 2 | 0 |
| Rpc10 | RNApol I | 4C2M monomer 2 | 0 |
| Rpa135 | RNApol I | 4C2M monomer 2 | 0 |
| Rpo26 | RNApol I | 4C2M monomer 2 | 1 |
| Rpa190 | RNApol I | 4C3I | 1 |
| Rpa14 | RNApol I | 4C3I | 37 |
| Rpb5 | RNApol I | 4C3I | 0 |
| Rpb10 | RNApol I | 4C3I | 1 |
| Rpa49 | RNApol I | 4C3I | 301 |
| Rpc19 | RNApol I | 4C3I | 0 |
| Rpb8 | RNApol I | 4C3I | 0 |
| Rpa34 | RNApol I | 4C3I | 53 |
| Rpa12 | RNApol I | 4C3I | 0 |
| Rpa43 | RNApol I | 4C3I | 10 |
| Rpc40 | RNApol I | 4C3I | 0 |
| Rpc10 | RNApol I | 4C3I | 0 |
| Rpa135 | RNApol I | 4C3I | 0 |
| Rpo26 | RNApol I | 4C3I | 1 |
| Rpb3 | RNApol II | 4V1N | 50 |
| Rpb11 | RNApol II | 4V1N | 6 |
| Rpb5 | RNApol II | 4V1N | 0 |
| Rpb7 | RNApol II | 4V1N | 0 |
| Rpb10 | RNApol II | 4V1N | 5 |
| Rpo26 | RNApol II | 4V1N | 0 |
| Rpb8 | RNApol II | 4V1N | 0 |
| Rpb4 | RNApol II | 4V1N | 0 |
| Rpb9 | RNApol II | 4V1N | 2 |
| Tfg2 | RNApol II | 4V1N | 173 |
| Rpb2 | RNApol II | 4V1N | 0 |
| Rpc10 | RNApol II | 4V1N | 0 |
| Rpo21 | RNApol II | 4V1N | 278 |
| Rpc11 | RNApol III | 5FJA | 0 |
| Rpc19 | RNApol III | 5FJA | 0 |
| Ret1 | RNApol III | 5FJA | 0 |
| Rpb5 | RNApol III | 5FJA | 0 |
| Rpb10 | RNApol III | 5FJA | 3 |
| Rpc37 | RNApol III | 5FJA | 20 |
| Rpc82 | RNApol III | 5FJA | 0 |
| Rpc31 | RNApol III | 5FJA | 182 |
| Rpb8 | RNApol III | 5FJA | 0 |
| Rpc53 | RNApol III | 5FJA | 0 |
| Rpc25 | RNApol III | 5FJA | 0 |
| Rpc34 | RNApol III | 5FJA | 2 |
| Rpo31 | RNApol III | 5FJA | 0 |
| Rpc40 | RNApol III | 5FJA | 0 |
| Rpc10 | RNApol III | 5FJA | 0 |
| Rpc17 | RNApol III | 5FJA | 0 |
| Rpo26 | RNApol III | 5FJA | 2 |
| Rpn6 | Proteasome | 5CZ4 and 5A5B | 3 |
| Rpn5 | Proteasome | 5CZ4 and 5A5B | 3 |
| Rpn3 | Proteasome | 5CZ4 and 5A5B | 45 |
| Rpn2 | Proteasome | 5CZ4 and 5A5B | 20 |
| Rpn1 | Proteasome | 5CZ4 and 5A5B | 0 |
| Rpn9 | Proteasome | 5CZ4 and 5A5B | 6 |
| Rpn8 | Proteasome | 5CZ4 and 5A5B | 30 |
| Pre10 | Proteasome | 5CZ4 and 5A5B | 39 |
| Pre6 | Proteasome | 5CZ4 and 5A5B | 10 |
| Pre7 | Proteasome | 5CZ4 and 5A5B | 0 |
| Rpt3 | Proteasome | 5CZ4 and 5A5B | 0 |
| Rpt2 | Proteasome | 5CZ4 and 5A5B | 1 |
| Pre2 | Proteasome | 5CZ4 and 5A5B | 0 |
| Rpt4 | Proteasome | 5CZ4 and 5A5B | 10 |
| Pre1 | Proteasome | 5CZ4 and 5A5B | 3 |
| Pre8 | Proteasome | 5CZ4 and 5A5B | 0 |
| Pre9 | Proteasome | 5CZ4 and 5A5B | 12 |
| Pup2 | Proteasome | 5CZ4 and 5A5B | 9 |
| Pup3 | Proteasome | 5CZ4 and 5A5B | 0 |
| Pup1 | Proteasome | 5CZ4 and 5A5B | 6 |
| Rpn13 | Proteasome | 5CZ4 and 5A5B | 23 |
| Rpn12 | Proteasome | 5CZ4 and 5A5B | 2 |
| Rpn11 | Proteasome | 5CZ4 and 5A5B | 8 |
| Rpn10 | Proteasome | 5CZ4 and 5A5B | 71 |
| Sem1 | Proteasome | 5CZ4 and 5A5B | 0 |
| Scl1 | Proteasome | 5CZ4 and 5A5B | 0 |
| Rpt1 | Proteasome | 5CZ4 and 5A5B | 11 |
| Pre4 | Proteasome | 5CZ4 and 5A5B | 4 |
| Pre5 | Proteasome | 5CZ4 and 5A5B | 0 |
| Rpt5 | Proteasome | 5CZ4 and 5A5B | 0 |
| Pre3 | Proteasome | 5CZ4 and 5A5B | 0 |
| Rpt6 | Proteasome | 5CZ4 and 5A5B | 9 |
| Rpn7 | Proteasome | 5CZ4 and 5A5B | 7 |


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complexes experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect interactions that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results for the intra-complexes experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
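The thresholding described in panel (B) can be sketched in a few lines. This is a minimal illustration and not the analysis pipeline used in this study: it assumes that z-scores are computed against the mean and sample standard deviation of all colony sizes in an experiment, which the caption does not specify, and the function names `z_scores` and `call_positive_ppis` are hypothetical. The default cutoff of 2.5 is the value read from the caption and is exposed as a parameter.

```python
from statistics import mean, stdev


def z_scores(colony_sizes):
    """Standardize colony sizes against the mean and SD of the full set.

    Assumption: the whole distribution is used as the reference; a
    dedicated background distribution could be used instead.
    """
    mu = mean(colony_sizes)
    sigma = stdev(colony_sizes)
    if sigma == 0:
        raise ValueError("all colony sizes are identical; z-scores are undefined")
    return [(size - mu) / sigma for size in colony_sizes]


def call_positive_ppis(colony_sizes, z_threshold=2.5):
    """Return one boolean per colony: True if its z-score exceeds the threshold."""
    return [z > z_threshold for z in z_scores(colony_sizes)]


# Toy data (hypothetical): twenty background colonies and one large colony.
sizes = [10] * 20 + [100]
flags = call_positive_ppis(sizes)
print(sum(flags), "positive PPI(s) out of", len(sizes))
```

Because strong positives inflate both the mean and the standard deviation, a robust variant based on the median and the median absolute deviation may be preferable when many interactions are expected to be positive.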


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
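The idea of a distance-weighted shortest path over surface residues, as illustrated in panels (C) and (D), can be sketched with Dijkstra's algorithm. The code below is a hedged illustration rather than the procedure used in this study: it assumes that surface residues are given as 3D coordinates, that two residues are connected when they lie within a cutoff distance, and that each edge is weighted by the Euclidean distance between them. The function name `shortest_surface_path` and the `cutoff` value are hypothetical.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, cutoff=10.0):
    """Length of the shortest path from coords[start] to coords[end].

    Nodes are surface residue coordinates (x, y, z). Two nodes are joined
    when they are at most `cutoff` apart (assumed value), and the edge
    weight is their Euclidean distance. Returns math.inf if no path exists.
    """
    best = [math.inf] * len(coords)
    best[start] = 0.0
    queue = [(0.0, start)]

    while queue:
        dist_so_far, node = heapq.heappop(queue)
        if node == end:
            return dist_so_far
        if dist_so_far > best[node]:
            continue  # stale queue entry
        for neighbour in range(len(coords)):
            if neighbour == node:
                continue
            step = math.dist(coords[node], coords[neighbour])
            if step <= cutoff and dist_so_far + step < best[neighbour]:
                best[neighbour] = dist_so_far + step
                heapq.heappush(queue, (best[neighbour], neighbour))

    return math.inf


# Toy coordinates (hypothetical): the two ends are linked only through the
# middle point when the cutoff is 6, so the path follows the surface points.
points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 8.0, 0.0), (100.0, 0.0, 0.0)]
print(shortest_surface_path(points, 0, 2, cutoff=6.0))
```

Restricting the graph to surface residues forces the path to go around the complex instead of through it, which is why such a distance can exceed the straight-line distance between two C-termini.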


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close to each other in space without necessarily being in physical interaction. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. Concretely, the approach was to modify the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results obtained with the distances measured from the available protein structures of the proteasome.

The results of the first validation show that by adjusting a single parameter, namely doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary depending on the number of combinations of linkers of different lengths used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and those whose signal increases.


The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by a destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when it involves the central proteins. Finally, the detected associations reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA represents a notable advance in the study of protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2] fragment, it is possible to increase the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the longer linkers, which would provide a validation and a point of comparison and would help detect problems that may have occurred during the construction of the proteins. For example, it becomes easier to spot a faulty recombination event or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously with each other unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that gives several resolutions of the PPI network. The first step would be to determine the maximum length that still allows plausible protein-protein associations to be detected while limiting false positives. It would also be necessary to determine the optimal increment that maximizes new information while taking into account the additional complexity introduced with each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas the resolution should be lower for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

49

64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9

50

84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709


consists of a barrel-shaped catalytic subcomplex flanked by one or two regulatory subcomplexes. It comprises 33 proteins, some of which are present in more than one copy (10-13). Given its importance in protein recycling, the proteasome is an attractive target for treating cancer and neurodegenerative diseases, among others (14-16).

The two preceding examples illustrate the central role of protein-protein associations. They nevertheless represent only a tiny part of a much larger and more elaborate interaction network. Mapping PPI networks is essential for understanding the organization, functioning, and cellular viability of a given organism. The PPI network has been mapped at large scale for several organisms, notably human (17), Saccharomyces cerevisiae (18-20), Drosophila melanogaster (21), Caenorhabditis elegans (22), several bacteria (23-26), and several viruses (27-29). These maps provide a static picture of the network and do not fully account for the cell's capacity to adapt to different conditions (e.g., environment, cell cycle). To overcome this limitation, additional maps were subsequently produced that consider the dynamics of interaction networks by perturbing cell growth conditions. Among other things, they shed light on the adaptation or plasticity of an organism exposed to stress or to a new environment. Despite this new perspective, it remains difficult to distinguish a stable interaction from a transient one using these maps.

1.2 Concrete applications of the study of protein-protein interactions

The study of PPIs offers a new perspective on fields such as evolution and medicine. The evolutionary history of protein complexes can be traced by comparing PPIs, as shown by the study of the nuclear pore in yeast and trypanosomes (30). These two organisms, which diverged more than 1.5 billion years ago, show both similarities and differences in the structure of their nuclear pore. This essential protein complex forms a channel in the nuclear membrane and controls the transport of molecules between the nucleus and the cytoplasm. Obado and colleagues identified the ancestral part of the nuclear pore and the part that subsequently diverged. The structural differences explain the distinct mRNA export mechanisms in the two organisms (30). In addition, perturbing PPIs makes it possible to assess the robustness of a protein complex to mutations, that is, the capacity of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a nonessential protein complex whose function is to recycle membrane receptors. By analyzing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of certain proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).

In medicine, the study of PPIs has been widely used to discover new drugs (32-34). In addition, identifying structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi, and Trypanosoma brucei, which may eventually make it possible to treat the infections caused by these parasites (35). PPIs also help explain the genetic basis of diseases, as shown by Sahni and colleagues. This team examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of interaction networks, either partial or complete, was responsible for the diseases studied. Moreover, different mutations in the same gene lead to different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, since each has advantages and limitations that allow it to capture only a particular subset of the interaction network (37). Even so, the methods can be divided into two main categories: those that determine the composition of protein complexes and those that determine physical interactions between two proteins.

The first category includes methods that purify a protein complex, by either affinity or separation chromatography, and then analyze it by mass spectrometry (MS). The second category covers a wide range of methods, including yeast two-hybrid (Y2H), membrane yeast two-hybrid (MYTH), and the protein-fragment complementation assay (PCA). The methods in this second category share a very similar principle, based on the reconstitution of a functional reporter that emits a signal when the two proteins interact physically. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). Here, "hybrid method" refers to methods that detect associations between proteins that are close in space without necessarily being in physical contact. These methods therefore have characteristics of both categories. For the purposes of this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary: one defines the components of a protein complex, and the other defines the relationships among those components.

1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purifying protein complexes and identifying their components by MS is an approach that aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract by means of an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the nonspecific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, yielding a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins present in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed: from a first MS scan, a peptide is selected and fragmented, and a second spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this type of purification is that it allows all the protein complexes of an organism to be isolated for study (43).
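
As a minimal illustration of the mass-to-charge ratio on which this separation relies, the sketch below computes the m/z of a protonated peptide at several charge states. It assumes positive-ion mode with protons as the only charge carriers. The peptide mass is a hypothetical value and the function name is chosen for illustration only; neither comes from the original text.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in daltons (standard value, rounded)


def mass_to_charge(neutral_mass_da: float, charge: int) -> float:
    """Return the m/z of a peptide carrying `charge` protons (positive-ion mode)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass_da + charge * PROTON_MASS_DA) / charge


if __name__ == "__main__":
    peptide_mass = 1500.0  # hypothetical neutral peptide mass, in Da
    for z in (1, 2, 3):
        print(f"z={z}: m/z = {mass_to_charge(peptide_mass, z):.4f}")
```

The same peptide therefore appears at different positions in the spectrum depending on its charge state, which is one reason the observed profiles are matched against a database rather than read directly.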

1.3.2 Methods determining the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid, and protein-fragment complementation

Y2H, MYTH, and PCA are techniques based on the assembly of complementary reporter fragments, each attached to one of the two proteins of interest through a linker. When the two proteins of interest interact physically, the two reporter fragments assemble and reconstitute a functional reporter that produces a detectable signal.

In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that paved the way for several others. The technique nevertheless has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variations of the method have, however, been developed to allow the study of membrane proteins (45-47); one such method is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, suffers from background noise, and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as human, to be studied in a simpler model (51).

In MYTH, the two reporter fragments are derived from a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest interact physically, the transcription factor attached to the reconstituted ubiquitin is released and activates transcription of a reporter gene. Split-ubiquitin-based methods have enabled major advances in the study of membrane proteins, which are insoluble and reside outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the inability to quantify the results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of using a transcription factor as the reporter, it uses a protein that has been split into two fragments. The choice of reporter and of the cleavage site were key elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to associate spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved, in particular, in the synthesis of certain DNA bases (purines and thymine). In yeast, the signal observed is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative while preserving the native promoter of the proteins studied (48, 55, 56). In addition, PCA results suggest that the cellular localization of the proteins is preserved: there is a gene ontology enrichment for many proteins known to share the same cellular localization (55). A change in localization cannot be ruled out, however, because the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function, or abundance of the protein. Adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. Its composition or length may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers, and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and help ensure that the function of each element is maintained. They are most useful when a flexible linker fails to separate the two elements adequately or interferes with the activity of the protein. In vivo cleavable linkers release the reporter fragment under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. Choosing the linker and its parameters carefully is therefore essential to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although placed in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on energy transfer between two fluorescent proteins that are close to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Exciting the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are near each other. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and the partial overlap of the fluorescence of the two proteins can hinder interpretation of the results (60-63).
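
To make the distance dependence of FRET concrete, the sketch below evaluates the standard Förster relation, E = 1 / (1 + (r / R0)^6), where r is the donor-acceptor distance and R0 is the distance at which transfer efficiency is 50%. The relation itself is not stated in the text above, and the R0 value used here is a hypothetical placeholder rather than a property of any specific fluorescent protein pair.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Förster relation: E = 1 / (1 + (r / R0)**6)."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and Förster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


if __name__ == "__main__":
    r0 = 5.0  # hypothetical Förster radius, in nm
    for r in (2.5, 5.0, 7.5, 10.0):
        print(f"r = {r:4.1f} nm -> E = {fret_efficiency(r, r0):.3f}")
```

Because of the sixth-power term, efficiency falls off steeply around R0, which is why a FRET signal is read as evidence of close proximity rather than of a physical interaction as such.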

Cross-linking followed by MS is nearly identical to the purification and MS techniques, except that the proteins are covalently linked to one another before purification. These links withstand enzymatic digestion and thus provide structural information on how the proteins associate within the protein complex. However, cross-linking complicates data analysis and can lead to an incorrect picture of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity, fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to interactions between neighboring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can impair the activity of the protein of interest or the formation of interactions. In addition, the method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They report on the proximity of proteins, giving access to a scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply at large scale. Developing simpler, higher-throughput hybrid methods would help better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. They would also compensate for the shortcomings of high-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to large numbers of protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble, flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It provides good flexibility and allows the reporter fragments to associate properly in the cellular environment. Glycine and serine are both small amino acids, the former nonpolar and the latter polar and uncharged. The linker connects the reporter fragment to the C-terminus of the proteins under study.

The length of the linker also places a constraint on the ability to detect an interaction, as was notably observed by the research team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 35 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing the linker length made it possible to determine the conformation of a dimeric receptor (69). Linker length can thus be used to detect new interactions and, in doing so, to obtain new structural information.
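
The relationship between linker length and the 82 Å threshold can be sketched as simple arithmetic. The per-residue length below is an assumption back-calculated from the text (82 Å for two linkers of two GGGGS repeats each, that is, 20 residues in total), and the function name and the longer repeat counts are hypothetical examples, not the linkers actually used in this work.

```python
RESIDUES_PER_REPEAT = 5  # one GGGGS motif

# Assumption: per-residue length back-calculated from the text
# (82 Å spanned by two 10-residue linkers end to end -> 82 / 20 Å per residue).
ANGSTROM_PER_RESIDUE = 82.0 / 20.0


def max_linker_span(n_repeats_a: int, n_repeats_b: int) -> float:
    """Upper bound (in Å) on the C-terminus separation bridged by two fully extended linkers."""
    if n_repeats_a < 0 or n_repeats_b < 0:
        raise ValueError("repeat counts must be non-negative")
    total_residues = (n_repeats_a + n_repeats_b) * RESIDUES_PER_REPEAT
    return total_residues * ANGSTROM_PER_RESIDUE


if __name__ == "__main__":
    # (2, 2) is the linker pair described in the text; the others are hypothetical.
    for pair in ((2, 2), (2, 4), (4, 4)):
        print(f"repeats {pair}: up to {max_linker_span(*pair):.0f} Å")
```

This is an upper bound only: a flexible linker is rarely fully extended in the cell, so the effective reach is likely to be shorter than the value computed here.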

1.6 Research objectives

The preceding results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the length of the DHFR PCA linker would allow the detection of interactions between proteins that are increasingly distant in space, thereby modulating the scale of molecular resolution. This adaptation would yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations were tested between 15 proteins found in seven protein complexes and the other proteins found in these complexes, along with their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II, and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The final objective was to determine whether increasing linker length allowed the detection of associations between proteins that are farther apart in space. To do so, distances were calculated between the proteins in the proteasome structures and compared with the experimental results.

This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive in several respects, notably the availability of numerous powerful genetic tools, its rapid cell division and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system functions requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that the use of extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool to detect and measure protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast Protein-fragment Complementation Assay based on the dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86).

The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories of methods are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future development in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was shown that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 35 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter allows the detection of configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% Yeast Extract, 2% Tryptone, 2% Glucose and 2% Agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% Yeast Nitrogen Base without amino acids and without ammonium sulfate, 2% Glucose, 2.5% Noble Agar, Drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% Yeast extract, 1.6% Tryptone, 0.2% Glucose, 0.5% NaCl and 2% Agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.

In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed into E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR-amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3' end of genes. 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (for resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains with proteins fused with the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies directed against them. 20 OD600 of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belonged to the same complexes as the baits or interacted with one of them based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we performed a review of the literature and considered the consensus protein complexes published by (84) to choose 95 central and associated protein members of the following complexes: the RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Each 1536-array was finally designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30°C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the differences in signal were increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction where the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area smaller than 20 pixels and a circularity lower than 0.5, using the particle that is closest to the center. We considered the particle as being a colony if the mass center was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
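The transformation and filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`mtx_intensity`, `diploid_size`) and the function names are hypothetical and do not come from the original analysis scripts; the threshold of 16 is taken from the text.

```python
import math

def log2_intensity(intensity):
    """Return log2(intensity + 1); adding 1 keeps a zero intensity defined (it maps to 0)."""
    return math.log2(intensity + 1)

def filter_and_transform(colonies, min_diploid_size=16):
    """Drop colonies that grew poorly on the diploid selection plate,
    then log2-transform the MTX intensity of the remaining colonies.

    `colonies` is a list of dicts with the hypothetical keys
    'mtx_intensity' and 'diploid_size'.
    """
    return [
        log2_intensity(c["mtx_intensity"])
        for c in colonies
        if c["diploid_size"] >= min_diploid_size  # "smaller than 16" is eliminated
    ]
```

Filtering on the diploid plate rather than on the MTX plate separates colonies that failed to mate or grow from colonies that grew as diploids but gave no PCA signal.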

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered as significantly detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
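As a minimal sketch of the scoring step, the Python snippet below computes the interaction score as the median of replicates and converts it to a z-score. It assumes that the background mean (b) and standard deviation (sdb) have already been estimated by the mixture fit (done with mixtools in R in the study); the function names and the example values in the usage note are hypothetical.

```python
from statistics import median

def interaction_score(replicate_sizes):
    """Interaction score (Is): median of the log2 colony sizes of the replicates."""
    return median(replicate_sizes)

def z_score(is_value, b, sdb):
    """Zs = (Is - b) / sdb, where b and sdb describe the background component."""
    return (is_value - b) / sdb

def is_detected(zs, threshold=2.5):
    """An interaction is called detected when Zs exceeds the threshold."""
    return zs > threshold
```

For example, replicates of 10, 12, 11 and 13 give an Is of 11.5; with an assumed background of b = 8 and sdb = 1, the resulting Zs is 3.5, which exceeds the 2.5 threshold.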

For the intra-complexes experiment, extreme outliers on the MTX selection plates that fell below Q1 - 3 × (Q3 - Q1) or above Q3 + 3 × (Q3 - Q1) were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered as being detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
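The outlier rule (values beyond three interquartile ranges from the quartiles) can be illustrated as follows. This is a sketch, not the original analysis code: the quartile definition used in the study is not stated, so the "inclusive" method of Python's `statistics.quantiles` is an assumption, and the function name is hypothetical.

```python
from statistics import quantiles

def remove_extreme_outliers(values, k=3.0):
    """Keep values inside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)].

    Quartiles are computed with the 'inclusive' method, which is an
    assumption; other quartile definitions give slightly different fences.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    # The original order of the values is preserved.
    return [v for v in values if low <= v <= high]
```

With k = 3 the fences are wider than the usual 1.5 × IQR rule, so only colonies that are very far from the bulk of the replicates are discarded.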

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
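The classification logic can be sketched in Python as below. Several points are assumptions rather than facts from the text: the FDR procedure is not named, so the Benjamini-Hochberg adjustment is used here as a common choice; the paired t-test p-values are taken as given inputs; and a pair with a significant p-value but an absolute difference of 1 or less, a case the text does not describe, is labeled "unchanged". All function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify(difference, fdr_p, alpha=0.05, min_abs_diff=1.0):
    """Label a pair already detected with the reference 2xL-2xL linkers.

    `difference` is Is(longer linker combination) - Is(2xL-2xL).
    """
    if fdr_p < alpha and abs(difference) > min_abs_diff:
        return "quantitative change"
    return "unchanged"  # assumption for cases the text leaves unspecified
```

Requiring both a corrected p-value below 0.05 and an absolute difference above 1 (a two-fold change on the log2 scale) avoids calling small but statistically significant shifts a quantitative change.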

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of the edges was equal to the distance between node pairs. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure were considered as surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal distances possible was used for the analyses.
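The surface-path distance described above can be illustrated with a short Python sketch. The study used the Dijkstra implementation in NetworkX; to keep the example dependency-free, the snippet below swaps in a plain `heapq`-based Dijkstra from the standard library. The function names and the toy coordinates in the usage note are hypothetical, and the surface Cα coordinates are assumed to have been extracted beforehand.

```python
import heapq
import math

def build_graph(coords, cutoff):
    """Connect every pair of surface C-alpha atoms closer than `cutoff` (in Å).
    Edge weights are the Euclidean distances between the two atoms."""
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            new_dist = dist + weight
            if new_dist < best.get(neighbor, math.inf):
                best[neighbor] = new_dist
                heapq.heappush(heap, (new_dist, neighbor))
    return math.inf
```

For four toy points at (0,0,0), (10,0,0), (20,0,0) and (20,10,0) with a cutoff of 12 Å, the path from the first to the last point follows the three 10 Å edges, for a length of 30 Å, whereas the straight-line distance is about 22.4 Å. This is the reason for routing over surface residues: the path approximates the distance a linker would have to span around the complex rather than through it.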

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counting as only one PPI).
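The distinction between directed pairs (228) and unique pairs (197) comes from counting reciprocal bait-prey orientations once. A minimal Python sketch of that counting rule is given below; the function name and the placeholder proteins X, Y and Z are hypothetical and are not data from the study.

```python
def count_unique_pairs(directed_pairs):
    """Count protein pairs regardless of orientation.

    Each element of `directed_pairs` is a (bait, prey) tuple; the pairs
    (X, Y) and (Y, X) are reciprocal and count as a single unique pair.
    """
    return len({frozenset(pair) for pair in directed_pairs})
```

For instance, the directed list [("X", "Y"), ("Y", "X"), ("X", "Z")] contains three directed pairs but only two unique pairs.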

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, which supports the view that elongated linkers cause no overall decrease in specificity. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). The PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.
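The three categories used here (unchanged, quantitatively changed and new) can be made explicit with a short sketch. It is only an illustration under stated assumptions: the detection cutoff `Z_THRESHOLD`, the function `classify_ppi` and the `significant` flag are hypothetical, and the statistical comparison that decides whether a change is significant is supplied by the caller rather than modeled.

```python
Z_THRESHOLD = 2.5  # assumed detection cutoff on the z-score (hypothetical parameter)


def classify_ppi(z_ref: float, z_long: float, significant: bool) -> str:
    """Classify one protein pair from its reference and longer-linker z-scores.

    z_ref       -- z-score with the 2xL-2xL reference combination
    z_long      -- z-score with a combination that includes at least one 4xL
    significant -- outcome of a statistical comparison of the two signals,
                   provided by the caller (the test itself is not modeled here)
    """
    ref_detected = z_ref > Z_THRESHOLD
    long_detected = z_long > Z_THRESHOLD
    if not ref_detected and not long_detected:
        return "not detected"
    if not ref_detected:
        return "new"          # only seen with a longer linker
    if not significant:
        return "unchanged"
    return "increased" if z_long > z_ref else "decreased"


# Hypothetical z-scores for four protein pairs.
for z_ref, z_long, sig in [(1.0, 6.0, True), (5.0, 5.2, False),
                           (5.0, 9.0, True), (8.0, 1.0, True)]:
    print(z_ref, z_long, classify_ppi(z_ref, z_long, sig))
```

In this sketch a PPI that is lost with the longer linker is reported as "decreased", which matches the way PPI loss and signal reduction are grouped in the text.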

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often further improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of RNApol II, contributes to five of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol complexes than for the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to an individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1, from the core subunit, and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, the PCA signal obtained with longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). A larger fraction of PPIs is detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distance, where positive z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases the signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.
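The rank correlation between pairwise distance and z-score can be sketched as follows. This is a minimal, dependency-free illustration of Spearman's coefficient (Pearson correlation computed on average ranks); the helper names and the toy data are hypothetical, and the sketch does not compute a p-value.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical C-terminus distances (in angstroms) and PPI z-scores.
distances = [10.0, 20.0, 30.0, 40.0, 50.0]
z_scores = [9.0, 7.5, 7.6, 3.0, 1.0]
print(f"Spearman r = {spearman(distances, z_scores):.2f}")
```

A negative coefficient, as in this toy example, corresponds to the trend described above: the signal tends to decrease as the distance between C-termini increases.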

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering each pair of reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between the z-scores of all detected PPIs in the proteasome and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and for those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A. Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference      Yeast protein  Complex     Identity (%)
4C2M chain 1   Rpc10          RNApol I    100
4C2M chain 2   Rpa34          RNApol I    92.4
4C2M chain 3   Rpa49          RNApol I    94.4
4C2M chain 4   Rpa43          RNApol I    100
4C2M chain 5   Rpa190         RNApol I    89.7
4C2M chain 6   Rpc40          RNApol I    100
4C2M chain 7   Rpa135         RNApol I    97.2
4C2M chain 8   Rpb5           RNApol I    100
4C2M chain 9   Rpa14          RNApol I    59.6
4C2M chain 10  Rpa43          RNApol I    81.4
4C2M chain 11  Rpo26          RNApol I    100
4C2M chain 12  Rpa12          RNApol I    100
4C2M chain 13  Rpb8           RNApol I    88.2
4C2M chain 14  Rpc19          RNApol I    100
4C2M chain 15  Rpb10          RNApol I    100
4C2M chain 16  Rpa49          RNApol I    100
4C2M chain 17  Rpc10          RNApol I    100
4C2M chain 18  Rpa43          RNApol I    100
4C2M chain 19  Rpa34          RNApol I    92.4
4C2M chain 20  Rpa135         RNApol I    96.2
4C2M chain 21  Rpa190         RNApol I    88.5
4C2M chain 22  Rpa14          RNApol I    55.1
4C2M chain 23  Rpc40          RNApol I    100
4C2M chain 24  Rpo26          RNApol I    100
4C2M chain 25  Rpb5           RNApol I    100
4C2M chain 26  Rpb8           RNApol I    88.2
4C2M chain 27  Rpa43          RNApol I    80.2
4C2M chain 28  Rpb10          RNApol I    100
4C2M chain 29  Rpa12          RNApol I    96
4C2M chain 30  Rpc19          RNApol I    100
4C3I chain A   Rpa190         RNApol I    89.2
4C3I chain C   Rpc40          RNApol I    99.3
4C3I chain B   Rpa135         RNApol I    98.2
4C3I chain E   Rpb5           RNApol I    100
4C3I chain D   Rpa14          RNApol I    55.1
4C3I chain G   Rpa43          RNApol I    78.3
4C3I chain F   Rpo26          RNApol I    100
4C3I chain I   Rpa12          RNApol I    100
4C3I chain H   Rpb8           RNApol I    84.7
4C3I chain K   Rpc19          RNApol I    100
4C3I chain J   Rpb10          RNApol I    100
4C3I chain M   Rpa49          RNApol I    97.2
4C3I chain L   Rpc10          RNApol I    100
4C3I chain N   Rpa34          RNApol I    88
4V1N chain A   Rpo21          RNApol II   97.9
4V1N chain C   Rpb3           RNApol II   100
4V1N chain B   Rpb2           RNApol II   93.6
4V1N chain E   Rpb5           RNApol II   100
4V1N chain D   Rpb4           RNApol II   80.8
4V1N chain G   Rpb7           RNApol II   100
4V1N chain F   Rpo26          RNApol II   100
4V1N chain I   Rpb9           RNApol II   100
4V1N chain H   Rpb8           RNApol II   91
4V1N chain K   Rpb11          RNApol II   100
4V1N chain J   Rpb10          RNApol II   100
4V1N chain L   Rpc10          RNApol II   100
4V1N chain R   Tfg2           RNApol II   60.3
5FJA chain A   Rpo31          RNApol III  96.2
5FJA chain C   Rpc40          RNApol III  100
5FJA chain B   Ret1           RNApol III  100
5FJA chain E   Rpb5           RNApol III  100
5FJA chain D   Rpc17          RNApol III  73.9
5FJA chain G   Rpc25          RNApol III  85.8
5FJA chain F   Rpo26          RNApol III  100
5FJA chain I   Rpc11          RNApol III  82.7
5FJA chain H   Rpb8           RNApol III  94.5
5FJA chain K   Rpc19          RNApol III  100
5FJA chain J   Rpb10          RNApol III  100
5FJA chain M   Rpc37          RNApol III  84.9
5FJA chain L   Rpc10          RNApol III  100
5FJA chain O   Rpc82          RNApol III  84.3
5FJA chain N   Rpc53          RNApol III  73.8
5FJA chain Q   Rpc31          RNApol III  100
5FJA chain P   Rpc34          RNApol III  57.2

Table S2C. Identity between the proteasome structure and the experimental sequence

Reference                        Yeast protein  Complex     Identity (%)
5CZ4-centered chain A            Pre8           Proteasome  100
5CZ4-centered chain AA           Pre4           Proteasome  100
5CZ4-centered chain B            Pre9           Proteasome  100
5CZ4-centered chain BA           Pre3           Proteasome  100
5CZ4-centered chain C            Pre6           Proteasome  100
5CZ4-centered chain D            Pup2           Proteasome  97.1
5CZ4-centered chain E            Pre5           Proteasome  100
5CZ4-centered chain F            Pre10          Proteasome  100
5CZ4-centered chain G            Scl1           Proteasome  100
5CZ4-centered chain H            Pup1           Proteasome  100
5CZ4-centered chain I            Pup3           Proteasome  100
5CZ4-centered chain J            Pre1           Proteasome  100
5CZ4-centered chain K            Pre2           Proteasome  100
5CZ4-centered chain L            Pre7           Proteasome  100
5CZ4-centered chain M            Pre4           Proteasome  100
5CZ4-centered chain N            Pre3           Proteasome  100
5CZ4-centered chain O            Pre8           Proteasome  100
5CZ4-centered chain P            Pre9           Proteasome  100
5CZ4-centered chain Q            Pre6           Proteasome  100
5CZ4-centered chain R            Pup2           Proteasome  97.1
5CZ4-centered chain S            Pre5           Proteasome  100
5CZ4-centered chain T            Pre10          Proteasome  100
5CZ4-centered chain U            Scl1           Proteasome  100
5CZ4-centered chain V            Pup1           Proteasome  100
5CZ4-centered chain W            Pup3           Proteasome  100
5CZ4-centered chain X            Pre1           Proteasome  100
5CZ4-centered chain Y            Pre2           Proteasome  100
5CZ4-centered chain Z            Pre7           Proteasome  100
5A5B-centered chain A            Pre3           Proteasome  100
5A5B-centered chain AA           Rpn7           Proteasome  100
5A5B-centered chain B            Pup1           Proteasome  100
5A5B-centered chain BA           Rpn3           Proteasome  100
5A5B-centered chain C            Pup3           Proteasome  100
5A5B-centered chain CA           Rpn12          Proteasome  100
5A5B-centered chain D            Pre1           Proteasome  100
5A5B-centered chain DA           Rpn8           Proteasome  82.9
5A5B-centered chain E            Pre2           Proteasome  99.5
5A5B-centered chain EA           Rpn11          Proteasome  89.5
5A5B-centered chain F            Pre7           Proteasome  100
5A5B-centered chain FA           Rpn10          Proteasome  100
5A5B-centered chain G            Pre4           Proteasome  100
5A5B-centered chain GA           Rpn13          Proteasome  100
5A5B-centered chain HA           Sem1           Proteasome  100
5A5B-centered chain IA           Rpn1           Proteasome  85.9
5A5B-centered chain J            Scl1           Proteasome  100
5A5B-centered chain K            Pre8           Proteasome  100
5A5B-centered chain L            Pre9           Proteasome  100
5A5B-centered chain M            Pre6           Proteasome  100
5A5B-centered chain N            Pup2           Proteasome  100
5A5B-centered chain O            Pre5           Proteasome  100
5A5B-centered chain P            Pre10          Proteasome  100
5A5B-centered chain Q            Rpt1           Proteasome  88
5A5B-centered chain R            Rpt2           Proteasome  100
5A5B-centered chain S            Rpt6           Proteasome  100
5A5B-centered chain T            Rpt3           Proteasome  100
5A5B-centered chain U            Rpt4           Proteasome  100
5A5B-centered chain V            Rpt5           Proteasome  93.1
5A5B-centered chain W            Rpn2           Proteasome  90.9
5A5B-centered chain X            Rpn9           Proteasome  100
5A5B-centered chain Y            Rpn5           Proteasome  100
5A5B-centered chain Z            Rpn6           Proteasome  100
Constructed proteasome chain 1   Pup1           Proteasome  100
Constructed proteasome chain 10  Pre8           Proteasome  100
Constructed proteasome chain 11  Pre9           Proteasome  100
Constructed proteasome chain 12  Pre6           Proteasome  100
Constructed proteasome chain 13  Pup2           Proteasome  100
Constructed proteasome chain 14  Pre5           Proteasome  100
Constructed proteasome chain 15  Pre10          Proteasome  100
Constructed proteasome chain 16  Rpt1           Proteasome  88
Constructed proteasome chain 17  Rpt2           Proteasome  100
Constructed proteasome chain 18  Rpt6           Proteasome  100
Constructed proteasome chain 19  Rpt3           Proteasome  100
Constructed proteasome chain 2   Pup3           Proteasome  100
Constructed proteasome chain 20  Rpt4           Proteasome  100
Constructed proteasome chain 21  Rpt5           Proteasome  93.1
Constructed proteasome chain 22  Rpn2           Proteasome  90.9
Constructed proteasome chain 23  Rpn9           Proteasome  100
Constructed proteasome chain 24  Rpn5           Proteasome  100
Constructed proteasome chain 25  Rpn6           Proteasome  100
Constructed proteasome chain 26  Rpn7           Proteasome  100
Constructed proteasome chain 27  Rpn3           Proteasome  100
Constructed proteasome chain 28  Rpn12          Proteasome  100
Constructed proteasome chain 29  Rpn8           Proteasome  82.9
Constructed proteasome chain 3   Pre1           Proteasome  100
Constructed proteasome chain 30  Rpn11          Proteasome  89.5
Constructed proteasome chain 31  Rpn10          Proteasome  100
Constructed proteasome chain 32  Rpn13          Proteasome  100
Constructed proteasome chain 33  Sem1           Proteasome  100
Constructed proteasome chain 34  Rpn1           Proteasome  85.9
Constructed proteasome chain 35  Pup1           Proteasome  100
Constructed proteasome chain 36  Pup3           Proteasome  100
Constructed proteasome chain 37  Pre1           Proteasome  100
Constructed proteasome chain 38  Pre2           Proteasome  100
Constructed proteasome chain 39  Pre7           Proteasome  100
Constructed proteasome chain 4   Pre2           Proteasome  100
Constructed proteasome chain 40  Pre4           Proteasome  100
Constructed proteasome chain 41  Pre3           Proteasome  100
Constructed proteasome chain 42  Pre4           Proteasome  100
Constructed proteasome chain 45  Scl1           Proteasome  100
Constructed proteasome chain 46  Pre8           Proteasome  100
Constructed proteasome chain 47  Pre9           Proteasome  100
Constructed proteasome chain 48  Pre6           Proteasome  100
Constructed proteasome chain 49  Pup2           Proteasome  100
Constructed proteasome chain 5   Pre7           Proteasome  100
Constructed proteasome chain 50  Pre5           Proteasome  100
Constructed proteasome chain 51  Pre10          Proteasome  100
Constructed proteasome chain 52  Rpt1           Proteasome  88
Constructed proteasome chain 53  Rpt2           Proteasome  100
Constructed proteasome chain 54  Rpt6           Proteasome  100
Constructed proteasome chain 55  Rpt3           Proteasome  100
Constructed proteasome chain 56  Rpt4           Proteasome  100
Constructed proteasome chain 57  Rpt5           Proteasome  93.1
Constructed proteasome chain 58  Rpn2           Proteasome  90.9
Constructed proteasome chain 59  Rpn9           Proteasome  100
Constructed proteasome chain 6   Pre3           Proteasome  100
Constructed proteasome chain 60  Rpn5           Proteasome  100
Constructed proteasome chain 61  Rpn6           Proteasome  100
Constructed proteasome chain 62  Rpn7           Proteasome  100
Constructed proteasome chain 63  Rpn3           Proteasome  100
Constructed proteasome chain 64  Rpn12          Proteasome  100
Constructed proteasome chain 65  Rpn8           Proteasome  82.9
Constructed proteasome chain 66  Rpn11          Proteasome  89.5
Constructed proteasome chain 67  Rpn10          Proteasome  100
Constructed proteasome chain 68  Rpn13          Proteasome  100
Constructed proteasome chain 69  Sem1           Proteasome  100
Constructed proteasome chain 70  Rpn1           Proteasome  85.9
Constructed proteasome chain 9   Scl1           Proteasome  100

Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

Yeast protein  Complex     Reference       Number of missing residues in C-ter
Rpa190         RNApol I    4C2M monomer 1  0
Rpa14          RNApol I    4C2M monomer 1  37
Rpa12          RNApol I    4C2M monomer 1  0
Rpb5           RNApol I    4C2M monomer 1  0
Rpb10          RNApol I    4C2M monomer 1  1
Rpa49          RNApol I    4C2M monomer 1  300
Rpc19          RNApol I    4C2M monomer 1  0
Rpb8           RNApol I    4C2M monomer 1  0
Rpa34          RNApol I    4C2M monomer 1  52
Rpa43          RNApol I    4C2M monomer 1  10
Rpc40          RNApol I    4C2M monomer 1  0
Rpc10          RNApol I    4C2M monomer 1  0
Rpa135         RNApol I    4C2M monomer 1  0
Rpo26          RNApol I    4C2M monomer 1  1
Rpa190         RNApol I    4C2M monomer 2  0
Rpa14          RNApol I    4C2M monomer 2  37
Rpa12          RNApol I    4C2M monomer 2  0
Rpb5           RNApol I    4C2M monomer 2  0
Rpb10          RNApol I    4C2M monomer 2  1
Rpa49          RNApol I    4C2M monomer 2  300
Rpc19          RNApol I    4C2M monomer 2  0
Rpb8           RNApol I    4C2M monomer 2  0
Rpa34          RNApol I    4C2M monomer 2  53
Rpa43          RNApol I    4C2M monomer 2  76
Rpc40          RNApol I    4C2M monomer 2  0
Rpc10          RNApol I    4C2M monomer 2  0
Rpa135         RNApol I    4C2M monomer 2  0
Rpo26          RNApol I    4C2M monomer 2  1
Rpa190         RNApol I    4C3I            1
Rpa14          RNApol I    4C3I            37
Rpb5           RNApol I    4C3I            0
Rpb10          RNApol I    4C3I            1
Rpa49          RNApol I    4C3I            301
Rpc19          RNApol I    4C3I            0
Rpb8           RNApol I    4C3I            0
Rpa34          RNApol I    4C3I            53
Rpa12          RNApol I    4C3I            0
Rpa43          RNApol I    4C3I            10
Rpc40          RNApol I    4C3I            0
Rpc10          RNApol I    4C3I            0
Rpa135         RNApol I    4C3I            0
Rpo26          RNApol I    4C3I            1
Rpb3           RNApol II   4V1N            50
Rpb11          RNApol II   4V1N            6
Rpb5           RNApol II   4V1N            0
Rpb7           RNApol II   4V1N            0
Rpb10          RNApol II   4V1N            5
Rpo26          RNApol II   4V1N            0
Rpb8           RNApol II   4V1N            0
Rpb4           RNApol II   4V1N            0
Rpb9           RNApol II   4V1N            2
Tfg2           RNApol II   4V1N            173
Rpb2           RNApol II   4V1N            0
Rpc10          RNApol II   4V1N            0
Rpo21          RNApol II   4V1N            278
Rpc11          RNApol III  5FJA            0
Rpc19          RNApol III  5FJA            0
Ret1           RNApol III  5FJA            0
Rpb5           RNApol III  5FJA            0
Rpb10          RNApol III  5FJA            3
Rpc37          RNApol III  5FJA            20
Rpc82          RNApol III  5FJA            0
Rpc31          RNApol III  5FJA            182
Rpb8           RNApol III  5FJA            0
Rpc53          RNApol III  5FJA            0
Rpc25          RNApol III  5FJA            0
Rpc34          RNApol III  5FJA            2
Rpo31          RNApol III  5FJA            0
Rpc40          RNApol III  5FJA            0
Rpc10          RNApol III  5FJA            0
Rpc17          RNApol III  5FJA            0
Rpo26          RNApol III  5FJA            2
Rpn6           Proteasome  5CZ4 and 5A5B   3
Rpn5           Proteasome  5CZ4 and 5A5B   3
Rpn3           Proteasome  5CZ4 and 5A5B   45
Rpn2           Proteasome  5CZ4 and 5A5B   20
Rpn1           Proteasome  5CZ4 and 5A5B   0
Rpn9           Proteasome  5CZ4 and 5A5B   6
Rpn8           Proteasome  5CZ4 and 5A5B   30
Pre10          Proteasome  5CZ4 and 5A5B   39
Pre6           Proteasome  5CZ4 and 5A5B   10
Pre7           Proteasome  5CZ4 and 5A5B   0
Rpt3           Proteasome  5CZ4 and 5A5B   0
Rpt2           Proteasome  5CZ4 and 5A5B   1
Pre2           Proteasome  5CZ4 and 5A5B   0
Rpt4           Proteasome  5CZ4 and 5A5B   10
Pre1           Proteasome  5CZ4 and 5A5B   3
Pre8           Proteasome  5CZ4 and 5A5B   0
Pre9           Proteasome  5CZ4 and 5A5B   12
Pup2           Proteasome  5CZ4 and 5A5B   9
Pup3           Proteasome  5CZ4 and 5A5B   0
Pup1           Proteasome  5CZ4 and 5A5B   6
Rpn13          Proteasome  5CZ4 and 5A5B   23
Rpn12          Proteasome  5CZ4 and 5A5B   2
Rpn11          Proteasome  5CZ4 and 5A5B   8
Rpn10          Proteasome  5CZ4 and 5A5B   71
Sem1           Proteasome  5CZ4 and 5A5B   0
Scl1           Proteasome  5CZ4 and 5A5B   0
Rpt1           Proteasome  5CZ4 and 5A5B   11
Pre4           Proteasome  5CZ4 and 5A5B   4
Pre5           Proteasome  5CZ4 and 5A5B   0
Rpt5           Proteasome  5CZ4 and 5A5B   0
Pre3           Proteasome  5CZ4 and 5A5B   0
Rpt6           Proteasome  5CZ4 and 5A5B   9
Rpn7           Proteasome  5CZ4 and 5A5B   7

Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in standard PCA (left).
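The relation between colony size, z-score and the detection threshold in panel (B) can be sketched as follows. This is only an illustration under stated assumptions: the z-score is taken here as the standard deviation-scaled distance from the mean of a background distribution, the cutoff value is passed as a parameter, and the function name and all colony sizes are hypothetical.

```python
from statistics import mean, stdev
from typing import List, Sequence


def z_scores(colony_sizes: Sequence[float],
             background: Sequence[float]) -> List[float]:
    """Express each colony size as a deviation from the background noise,
    in units of the background sample standard deviation."""
    mu = mean(background)
    sigma = stdev(background)
    return [(size - mu) / sigma for size in colony_sizes]


def call_positive(scores: Sequence[float], cutoff: float) -> List[bool]:
    """Flag the PPIs whose z-score lies above the chosen cutoff."""
    return [z > cutoff for z in scores]


# Hypothetical colony sizes (arbitrary units).
background = [100.0, 110.0, 90.0, 105.0, 95.0]
tested = [150.0, 104.0]

scores = z_scores(tested, background)
positive = call_positive(scores, cutoff=2.5)
print([round(z, 2) for z in scores], positive)
```

In practice the background distribution would come from the bulk of non-interacting pairs in the same screen, so the cutoff on colony size differs between experiments while the cutoff on the z-score stays the same.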


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
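A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm on a graph of surface points. The sketch below is an assumption about one possible implementation and not the procedure used to produce Table S2A: the function name, the neighbor `cutoff` and the toy coordinates are hypothetical. Two points are connected when they lie within the cutoff, and each edge is weighted by its Euclidean length, so the path follows the surface instead of crossing the complex.

```python
import heapq
import math
from typing import Dict, Tuple

Point = Tuple[float, float, float]


def shortest_path_length(coords: Dict[str, Point], start: str, goal: str,
                         cutoff: float) -> float:
    """Length of the shortest path from start to goal (Dijkstra).

    Points closer than `cutoff` are treated as connected; the weight of an
    edge is the Euclidean distance between its two points. Returns
    math.inf when the goal cannot be reached.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        dist_u, u = heapq.heappop(heap)
        if u in visited:
            continue
        if u == goal:
            return dist_u
        visited.add(u)
        for v, point in coords.items():
            if v in visited:
                continue
            step = math.dist(coords[u], point)
            if step <= cutoff and dist_u + step < best.get(v, math.inf):
                best[v] = dist_u + step
                heapq.heappush(heap, (best[v], v))
    return math.inf


# Toy coordinates (angstroms): A and C stand for two C-termini, B for a
# surface residue between them, D for an isolated point.
coords = {"A": (0.0, 0.0, 0.0), "B": (3.0, 0.0, 0.0),
          "C": (3.0, 4.0, 0.0), "D": (100.0, 0.0, 0.0)}
print(shortest_path_length(coords, "A", "C", cutoff=4.5))
```

With a cutoff of 4.5 the direct A to C edge (length 5) is not allowed, so the path goes through B. The all-pairs neighbor search is quadratic in the number of points; a spatial index would be preferable for a structure the size of the proteasome.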


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are in close proximity in space without necessarily interacting physically. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. Concretely, the approach was to modify the length of the linkers of the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results obtained with distances measured from the available protein structures of the proteasome.

The results of the first validation show that by varying a single parameter, namely by doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only a single combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment seems sufficient to detect the majority of new PPIs and of those whose signal increases. The rare cases where the signal decreased with increasing linker length are more likely caused by steric effects than by destabilization of the proteins involved. Nevertheless, these cases can still provide structural information, notably by identifying the strongest associations within the complex. Moreover, the use of extended linkers informs on the organization of protein complexes, particularly when it involves central proteins. Finally, the detected associations closely reflect the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results obtained, it is possible to confirm that increasing linker length does allow the detection of associations between proteins that are more distant in space.

The modification made to the DHFR PCA represents a valuable advance in the study of protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2] fragment, it is possible to increase the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths, which would provide a validation and a point of comparison and would help detect problems that may have occurred during construction of the proteins. For example, it is easier to spot faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. However, adding this control certainly makes experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no particular infrastructure, yet can also be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously unless they are very close together, which reduces false positives. DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the study of interaction dynamics (56). These features provide certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would give several resolutions of the PPI network. One would first need to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. One would also need to determine the optimal increment to maximize the new information gained, taking into account the additional complexity introduced with each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, could be sufficient, whereas the resolution should be coarser for large protein complexes.
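
To give a rough sense of how linker length maps onto spatial reach, the sketch below computes an upper bound on the distance spanned by two fully extended linkers. It is only an illustration: the linker lengths are hypothetical (they are not the lengths tested in this project), the value of about 0.38 nm per residue is the usual spacing between consecutive alpha-carbons in an extended polypeptide, and a real flexible linker would on average span much less than this bound. The size of the reporter fragments and of the proteins themselves is ignored.

```python
# Rough upper bound on the reach provided by peptide linkers.
# Assumption: a fully extended chain spans about 0.38 nm per residue
# (spacing between consecutive alpha-carbons). Real flexible linkers
# behave more like random coils and are shorter on average.
CA_CA_DISTANCE_NM = 0.38


def max_linker_length_nm(n_residues: int) -> float:
    """Contour length of one fully extended linker, in nanometres."""
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    return n_residues * CA_CA_DISTANCE_NM


def max_span_nm(n_residues: int) -> float:
    """Upper bound on the distance bridged when both fusion proteins
    carry a linker of the same length (two linkers end to end)."""
    return 2 * max_linker_length_nm(n_residues)


if __name__ == "__main__":
    # Hypothetical linker lengths, chosen only for illustration.
    for n in (10, 20, 40, 80):
        print(f"{n:>3} residues per linker: at most {max_span_nm(n):.1f} nm")
```

Because the bound grows linearly with the number of residues, doubling the linker length doubles the maximum reach, which is one simple way to think about the choice of increment discussed above.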

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. For the proteome of an organism, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.

Page 16: Mesurer les associations protéiques à proximité in …...Mesurer les associations protéiques à proximité in vivo en utilisant la complémentation de fragments protéiques Mémoire


of mRNA in both organisms (30). In addition, perturbing PPIs makes it possible to elucidate the robustness of a protein complex to mutations, that is, the ability of the complex to function despite the perturbation. Diss and colleagues systematically deleted the genes encoding the proteins found in the nuclear pore and the retromer (31). The retromer is a non-essential protein complex whose function is to recycle membrane receptors. By analyzing the interactions present in these complexes after each perturbation, the authors observed that the nuclear pore remained functional despite the loss of certain proteins, whereas the retromer dissociated completely after the loss of a single protein. They were thus able to identify the proteins essential for the assembly of these complexes and to demonstrate the importance of paralogs for robustness (31).

In the medical field, the study of PPIs has been widely used to discover new drugs (32-34). In addition, identifying the structural differences in a protein complex between two organisms can provide attractive targets for selectively inhibiting the complex of one organism. Very recently, a research group developed an inhibitor that targets the proteasome of Leishmania donovani, Leishmania major, Trypanosoma cruzi and Trypanosoma brucei, which should eventually make it possible to treat the infections caused by these parasites (35). PPIs also help in understanding the genetic basis of diseases, as shown by Sahni and colleagues. This team examined nearly 3000 mutations found across a spectrum of Mendelian diseases. In nearly 60% of cases, perturbation of the interaction networks was responsible for the diseases under study, affecting the networks either partially or completely. Furthermore, different mutations in the same gene lead to different perturbations (36).

1.3 Categories of methods for studying protein-protein interactions

Given the importance of PPI networks in cell biology, several methods have been developed to study them. These methods are complementary, since each has advantages and limitations that allow it to target only certain subsets of the interaction network (37). Nevertheless, all of the methods can be divided into two main categories: methods that determine the composition of protein complexes and methods that determine the physical interactions between two proteins.

The first category includes methods that purify a protein complex, by either affinity or separation chromatography, and then analyze it by mass spectrometry (MS). The second category covers a wide variety of methods, including the yeast two-hybrid (Y2H), the membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in the second category share a very similar principle, based on the reconstitution of a functional reporter that emits a signal when the two proteins physically interact. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, the term "hybrid method" refers to methods that detect associations between proteins that are close in space without these necessarily being physical interactions. These methods therefore have characteristics of both categories. For the purposes of this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary because they define, on the one hand, the components of a protein complex and, on the other hand, the relationships between them.

1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purification of protein complexes followed by identification of their components by MS aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract using an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the nonspecific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins found in the complex (38-40). Tandem mass spectrometry (MS/MS) is also possible. From a first MS, a peptide is selected and fragmented, and a new spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this purification is that it can isolate all of the protein complexes of an organism for study (43).
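
The mass-to-charge separation described above can be illustrated with a small calculation. The sketch below computes the monoisotopic mass of an unmodified peptide from standard residue masses, then the m/z value expected once a given number of protons has been added during ionization. The function names and the example peptide are illustrative only and do not come from the original text; real search engines also handle modifications, isotopes and fragment ions, which are left out here.

```python
# Monoisotopic masses (Da) of the 20 standard amino acid residues.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.010565   # added once per peptide (free N- and C-termini)
PROTON_MASS = 1.007276   # mass of each charge-carrying proton


def peptide_mass(sequence: str) -> float:
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER_MASS


def mass_to_charge(sequence: str, charge: int) -> float:
    """m/z of the peptide carrying `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON_MASS) / charge


if __name__ == "__main__":
    peptide = "PEPTIDE"  # arbitrary example sequence
    for z in (1, 2, 3):
        print(f"{peptide} z={z}: m/z = {mass_to_charge(peptide, z):.4f}")
```

The same peptide therefore appears at several positions in the spectrum depending on its charge state, which is why identification relies on comparing measured values with those computed for database sequences.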

1.3.2 Methods determining the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment complementation

Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments attached to the two proteins of interest through a linker. When the two proteins of interest physically interact, the two reporter fragments assemble, reconstituting a functional reporter that produces a detectable signal. In Y2H, the reporter is a transcription factor which, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that enabled the development of several others. However, this technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variations of the method have nevertheless been introduced to allow the study of membrane proteins (45-47); this variant is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, shows background noise and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to link the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as humans, to be studied in a simpler model (51).

In MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription factor is attached. In the presence of a physical interaction between the proteins of interest, the transcription factor attached to the reconstituted ubiquitin is released, activating transcription of a reporter gene. Split-ubiquitin-based methods have enabled major advances in the study of membrane proteins, which are insoluble and located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the impossibility of quantifying the results (47-50, 52, 53).

PCA is similar to the two methods described above, but rather than using a transcription factor as the reporter, it uses a protein that has been split into two fragments. The choice of reporter and of the cleavage site were decisive elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to interact spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as its reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved, in particular, in the synthesis of certain DNA bases (purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative while also retaining the natural promoter of the proteins studied (48, 55, 56). Furthermore, the results obtained by PCA suggest that the cellular localization of the proteins is preserved. Indeed, there is a gene ontology enrichment for several known proteins that share the same cellular localization (55). However, a change in localization cannot be ruled out, since the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. However, adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. Its composition or length may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are mainly useful when a flexible linker is insufficient to separate the two elements properly or interferes with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on energy transfer between two fluorescent proteins in close proximity to one another. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are close together. The interaction is detected by microscopy or by cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, the high background and the partial overlap of the fluorescence of the two proteins can hinder interpretation of the results (60-63).

Cross-linking followed by MS is practically identical to the purification and MS techniques, except that before purification the proteins are attached to one another by covalent bonds. These bonds withstand enzymatic digestion and thus provide structural information on how proteins associate within the protein complex. Nevertheless, cross-linking complicates data analysis and can potentially lead to a misleading picture of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity, fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to interactions between neighbouring proteins. However, the biotin ligase is larger than the green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can impair the activity of the protein of interest or the formation of interactions. Moreover, this method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They report on the proximity of proteins, giving access to a new scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, the existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would make it possible to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. These new hybrid methods would compensate for the shortcomings of high-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to many protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and of the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble, and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It provides good flexibility and good association of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small amino acids, the former nonpolar and the latter polar and uncharged. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also imposes a constraint on the ability to detect an interaction, as was notably observed by the research team that developed large-scale PCA (55). While studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to obtain new structural information.

1.6 Research objectives

The results above suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the length of the DHFR PCA linker would allow the detection of interactions that are increasingly distant in space, which would modulate the scale of molecular resolution. This adaptation would then yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes that differ in size and flexibility were studied: the proteasome, RNApol I, II, and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The last objective was to determine whether increasing linker length allowed the detection of associations between proteins that are farther apart in space. To do so, distances were calculated between the proteins contained in the proteasome structures and compared with the experimental results.

This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive for several reasons, notably the availability of numerous powerful genetic tools, its rapid cell division, and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks, and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system functions requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that the use of extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome, and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational, and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories, based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex depends on stable association among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87), and technologies based on similar principles (52). The two categories are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future development in cell and systems biology because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was shown that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within less than 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter allows the detection of configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
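As a rough illustration of the "ruler" idea, the sketch below extrapolates the 82 Å figure to longer linkers. It assumes that 82 Å corresponds to two ten-residue linkers placed end to end and that the reachable distance scales linearly with the number of residues. Both the linear scaling and the function name `max_span` are illustrative assumptions rather than results from this study, and real linkers are flexible and will rarely be fully extended.

```python
# Naive linear extrapolation of linker reach (assumption, not a measured value).
RESIDUES_PER_REPEAT = 5            # one GGGGS motif
REFERENCE_SPAN_ANGSTROM = 82.0     # reported for two 2xL linkers end to end
REFERENCE_RESIDUES = 2 * 2 * RESIDUES_PER_REPEAT  # 2 linkers x 2 repeats x 5 aa

ANGSTROM_PER_RESIDUE = REFERENCE_SPAN_ANGSTROM / REFERENCE_RESIDUES


def max_span(bait_repeats: int, prey_repeats: int) -> float:
    """Rough upper bound (in angstroms) on the C-terminus separation
    that two linkers with the given repeat counts could bridge."""
    residues = (bait_repeats + prey_repeats) * RESIDUES_PER_REPEAT
    return residues * ANGSTROM_PER_RESIDUE


if __name__ == "__main__":
    for bait, prey in [(2, 2), (3, 2), (4, 2), (4, 4)]:
        print(f"{bait}xL-{prey}xL: about {max_span(bait, prey):.0f} angstroms")
```

Under these assumptions, doubling both linkers doubles the nominal reach, which is the intuition behind testing several linker combinations.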

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose, and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine, and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl, and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm, and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
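To make the notion of synonymous repeats concrete, the following sketch checks that two differently encoded DNA repeats translate to the same GGGGS peptide. The two nucleotide sequences are hypothetical examples chosen for illustration and are not the sequences used in the plasmids. Only the glycine and serine entries of the standard genetic code are included.

```python
# Glycine and serine codons from the standard genetic code.
CODON_TABLE = {
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
}


def translate(dna: str) -> str:
    """Translate a Gly/Ser-only coding sequence codon by codon."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Hypothetical repeats: same peptide, different codon choices.
repeat_a = "GGTGGCGGAGGGTCT"
repeat_b = "GGAGGTGGGGGCAGC"

# A hypothetical 3xL coding sequence built from alternating repeats.
linker_3x = repeat_a + repeat_b + repeat_a

if __name__ == "__main__":
    print(translate(repeat_a), translate(repeat_b))
    print(translate(linker_3x))
```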

To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI, and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (for resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains with proteins fused with the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies raised against them. 20 OD600 units of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin, and 2 µg/mL Aprotinin). Glass beads (425-600 µm; Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 units of cells were separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p, and Bcy1p) SDS-PAGE gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p, or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000), or mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10000), IRDye® 680RD donkey anti-goat IgG (1:5000), or IRDye® 800CW goat anti-mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins that are part of seven complexes, fused to 2x/3x/4xL-DHFR F[1,2]. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or because they interact with one of them, based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published in (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II, and III, the proteasome, and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17), and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned such that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL, and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the difference in signal was increased, null, or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze Particles” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and circularity below 0.5, using the particle that is closest to the center. We considered the particle as being a colony if the mass center was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
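The particle filter described above can be sketched as follows. This is not the custom ImageJ script used in the study. It assumes the ImageJ definition of circularity, 4π × area / perimeter², and it assumes that a particle is rejected when either threshold is missed, which is one reading of the criteria. The function names are hypothetical.

```python
import math

MIN_AREA_PX = 20        # area threshold stated in the text (pixels)
MIN_CIRCULARITY = 0.5   # circularity threshold stated in the text


def circularity(area: float, perimeter: float) -> float:
    """ImageJ-style circularity: 1.0 for a perfect circle, near 0 for elongated shapes."""
    return 4.0 * math.pi * area / perimeter ** 2


def is_colony_candidate(area: float, perimeter: float) -> bool:
    """Keep a particle only if it is both large enough and round enough (assumed rule)."""
    return area >= MIN_AREA_PX and circularity(area, perimeter) >= MIN_CIRCULARITY


if __name__ == "__main__":
    radius = 5.0
    disc = (math.pi * radius ** 2, 2 * math.pi * radius)  # area, perimeter of a disc
    print(is_colony_candidate(*disc))       # round and large enough
    print(is_colony_candidate(30.0, 60.0))  # elongated streak
```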

Colony intensity values from day 4 of growth of the second MTX selection were log2 transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
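A minimal sketch of this transformation and filter is given below. The dictionary keys `mtx_intensity` and `diploid_size` are hypothetical names for the two measurements, and the units of the size threshold are taken as given in the text.

```python
import math

MIN_DIPLOID_SIZE = 16  # threshold stated in the text


def transform(colonies):
    """Drop colonies that grew poorly on the diploid selection plate,
    then return log2(intensity + 1) for the remaining colonies."""
    scores = []
    for colony in colonies:
        if colony["diploid_size"] < MIN_DIPLOID_SIZE:
            continue  # failed mating or diploid selection; not informative
        scores.append(math.log2(colony["mtx_intensity"] + 1))
    return scores


if __name__ == "__main__":
    example = [
        {"mtx_intensity": 0, "diploid_size": 20},     # no growth on MTX
        {"mtx_intensity": 1023, "diploid_size": 10},  # removed by the diploid filter
        {"mtx_intensity": 7, "diploid_size": 16},
    ]
    print(transform(example))
```

Adding 1 before taking the logarithm keeps colonies with zero intensity in the data set with a score of 0 instead of an undefined value.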

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
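The scoring rule can be sketched as follows, taking the background mean and standard deviation as inputs. In the study these two parameters come from the mixture model fitted in R, a step this sketch does not reproduce. The function names and the labels returned by `compare` are illustrative.

```python
Z_DETECTION = 2.5  # detection threshold stated in the text
Z_CHANGE = 2.0     # minimum Zs difference counted as a change


def z_score(interaction_score: float, bg_mean: float, bg_sd: float) -> float:
    """Zs = (Is - b) / sdb, with b and sdb estimated from the background component."""
    return (interaction_score - bg_mean) / bg_sd


def is_detected(zs: float) -> bool:
    return zs > Z_DETECTION


def compare(zs_reference: float, zs_longer: float) -> str:
    """Compare the same interaction measured with two linker combinations."""
    difference = zs_longer - zs_reference
    if difference > Z_CHANGE:
        return "increased"
    if difference < -Z_CHANGE:
        return "decreased"
    return "unchanged"


if __name__ == "__main__":
    zs_2x = z_score(9.0, bg_mean=8.0, bg_sd=1.0)   # hypothetical 2xL-2xL score
    zs_4x = z_score(12.0, bg_mean=8.0, bg_sd=1.0)  # hypothetical 4xL-2xL score
    print(is_detected(zs_2x), is_detected(zs_4x), compare(zs_2x, zs_4x))
```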

For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
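The extreme-outlier rule above amounts to fences placed three interquartile ranges beyond the quartiles. The sketch below uses the quartile convention of Python's `statistics.quantiles` (default "exclusive" method), which may differ slightly from the convention used in the original analysis. The function names are hypothetical.

```python
import statistics


def extreme_outlier_fences(values):
    """Return (lower, upper) fences at Q1 - 3*IQR and Q3 + 3*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - 3 * iqr, q3 + 3 * iqr


def drop_extreme_outliers(values):
    """Keep only the values that lie within the extreme-outlier fences."""
    lower, upper = extreme_outlier_fences(values)
    return [v for v in values if lower <= v <= upper]


if __name__ == "__main__":
    replicate_scores = [10, 11, 12, 13, 14, 15, 16, 100]  # hypothetical colony scores
    print(drop_extreme_outliers(replicate_scores))
```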

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was
not detected with the reference linker size (2xL-2xL) but was detected with a longer linker
combination) were separated from the others and classified as new interactions (Table S1C). For
the remaining pairs, baits and preys were positioned so that, in a block of four
adjacent strains, all combinations of linker lengths could be tested for a specific interaction
(2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). The Is for the different linker size combinations
could therefore be compared directly. The difference with the reference 2xL-2xL interaction was
calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was
used to identify significant differences in colony size (with FDR-corrected p-values).
These pairs of interacting proteins were separated into two additional categories: unchanged
interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL)
and also with the longer linker combinations but without any significant change (t-test
FDR p-value above 0.05); and quantitative changes, in cases where the interaction was
detected with the reference linker size (2xL-2xL) and presented significant changes for at
least one longer linker combination (difference greater than 1 or smaller than -1, with t-test
FDR p-value < 0.05) (Table S1C).
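The three categories can be summarized as a small decision rule. The sketch below is only an illustration under stated assumptions: the FDR correction method is not specified in the text, so the Benjamini-Hochberg procedure is used here as a stand-in; the function names, argument names and category labels are hypothetical; and the p-values are assumed to come from paired t-tests computed elsewhere.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value to enforce monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(detected_ref, detected_longer, differences, fdr_p_values,
             alpha=0.05, min_difference=1.0):
    """Assign one protein pair to a category.

    detected_ref    -- detected with the reference 2xL-2xL combination
    detected_longer -- detected with at least one longer combination
    differences     -- score difference versus 2xL-2xL, one per longer combination
    fdr_p_values    -- FDR-corrected paired t-test p-values, same order
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    for diff, p in zip(differences, fdr_p_values):
        if abs(diff) > min_difference and p < alpha:
            return "quantitative change"
    return "unchanged"


# Toy example: three longer combinations (2xL-4xL, 4xL-2xL, 4xL-4xL).
adjusted = benjamini_hochberg([0.01, 0.04, 0.03])
category = classify(True, True, [0.2, 1.5, 0.3], [0.5, 0.01, 0.6])
print(adjusted, category)
```

Keeping the correction and the classification in separate functions makes it easy to swap in another FDR procedure if a different one was actually used.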

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD
(http://www.yeastgenome.org) and searched against the RNApol I, II and III protein
complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software.
PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for
the RNApol I, II and III, respectively, as they included the largest number of proteins from the
experimental set with the highest sequence identities. Similarly, structure 4C2M was selected
as the representative RNApol I dimeric complex. Table S2B presents the identity between
each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and
the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in
the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the
experimental protein sequences of the individual sections of the proteasome complex with
the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure
5A5B is composed of the base, the lid and half of the core. Structure 5CZ4
is composed of a full core. A complete proteasome structure was built by superposing two
5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super
command in PyMOL. Visual inspection of the resulting superposed 5A5B structures
showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in
5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the
outer rings of the core. The inner rings of the core were taken from structure 5CZ4. Fig. S2A
summarizes the methodology used to build the final proteasome structure. Table S2C
presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is incomplete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest
path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as
nodes to build the graph. The edges of the graph were placed between each pair of nodes
using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight
of each edge was equal to the distance between the node pair. Surface residues were identified
as follows. First, the structure of the protein complex was represented using the "show dots"
and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol
II complex and of 20 Å for the proteasome. These dots were exported in the
"wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues
within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome
structure were considered surface residues (see Fig. S2D for a representation of the method
for the proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal possible distances was used for the analyses.
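The distance calculation can be sketched as follows: surface Cα atoms are nodes, edges connect nodes closer than a cutoff, edge weights are Euclidean distances, and the distance between two C-termini is the length of the weighted shortest path. The analysis in this study relied on the NetworkX implementation of Dijkstra's algorithm; the sketch below instead uses a small standard-library implementation so that it runs on its own. The node names, coordinates and helper functions are hypothetical, and only the 15 Å cutoff is taken from the text.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes within `cutoff`; edge weight = distance."""
    graph = {name: [] for name in coords}
    names = list(coords)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Length of the weighted shortest path (Dijkstra); inf if unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf


# Toy surface: hypothetical Calpha coordinates in angstroms.
coords = {
    "a": (0.0, 0.0, 0.0),    # C-terminal residue of protein 1
    "b": (10.0, 0.0, 0.0),
    "c": (20.0, 0.0, 0.0),   # C-terminal residue of protein 2
    "d": (10.0, 8.0, 0.0),
    "far": (100.0, 0.0, 0.0),  # isolated node, not connected at this cutoff
}
graph = build_graph(coords, cutoff=15.0)
print(shortest_path_length(graph, "a", "c"))
```

Because the path is constrained to surface nodes, the resulting distance follows the surface of the complex rather than cutting through it, which is why it can exceed the straight-line distance between the two termini.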

All PPI data related to the global PCA and intra-complex experiments can be found in
Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer
linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of
protein degradation was found for any of the six proteins examined using antibodies targeting
the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability,
it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins that are within
the same complexes as the baits, and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as
compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that
reduce signal due to the fusion of the DHFR fragments. Four out of nine increased
interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with
standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are
not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic interactions and
physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). All of these results thus show that the DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, proteasome and COG complexes), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between
the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique
tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C),
representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal
interactions such as X-DHFR F[1,2] - Y-DHFR F[3] and Y-DHFR F[1,2] - X-DHFR F[3]
counting as a single PPI).
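The distinction between all pairs and unique pairs comes down to treating the bait-prey orientation as irrelevant. A minimal Python sketch of this counting is given below; the function name is hypothetical and the listed pairs are toy examples chosen for illustration, not results from Table S1C.

```python
def unique_pairs(interactions):
    """Collapse reciprocal (bait, prey) and (prey, bait) entries into one pair."""
    return {frozenset(pair) for pair in interactions}


# Toy list of (bait, prey) orientations; the first two are reciprocal.
tested = [("Rpb2", "Rpb3"), ("Rpb3", "Rpb2"), ("Rpb5", "Rpo26")]
print(len(tested), len(unique_pairs(tested)))
```

Using a frozenset as the key also handles a protein tested against itself, which collapses to a single-element set.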

As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2] - Y-DHFR F[3] versus Y-DHFR F[1,2] - X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in
interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the
conclusion that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than from destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each
individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1, from the core
subunit, and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all linker
lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linkers is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distances, where positive z-scores accumulate up to a longer distance for the
4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D).
Finally, we find that the distance between proteins is significantly longer for cases where a longer
linker increases signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that a longer linker enhances the ability to detect interactions,
especially for proteins that are more distant in space.
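The relationship between pairwise distance and interaction z-score is summarized here with a rank correlation. As an illustration of how such a coefficient is obtained, the sketch below computes a Spearman correlation from scratch (average ranks for ties, then a Pearson correlation on the ranks). It is a minimal sketch using only the standard library; the function names and the toy distances and z-scores are hypothetical and are not values from Table S2A.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation coefficient."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data: distances between C-termini and matching z-scores.
distances = [10.0, 20.0, 30.0, 40.0]
z = [5.0, 4.0, 2.0, 3.0]
rho = spearman(distances, z)
print(round(rho, 3))
```

A negative coefficient, as in this toy example, corresponds to z-scores that tend to decrease as the distance between C-termini increases.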

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
among subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys, requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in
recent years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.

Acknowledgements

Funding for this project came from Canadian Institutes of Health Research grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.


Figure 1 Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex
PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR
F[1,2] - Y-DHFR F[3] and Y-DHFR F[1,2] - X-DHFR F[3] as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2 Longer linkers allow for the detection of more distant proteins within

complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16).
Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores
for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution
of three categories of detected PPIs for the RNApol and proteasome complexes according to
the distance between the C-termini, for interactions that are not affected by longer linkers and
those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request

Table S1B PCA data for global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request

Table S1C PCA data for intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request

Table S1D PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request


Table S2A Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request


Table S2B Identity between each RNApol structures and the experimental sequences

Reference  Yeast proteins  Complex  Identity (%)
4C2M chain 1  Rpc10  RNApol I  100
4C2M chain 2  Rpa34  RNApol I  92.4
4C2M chain 3  Rpa49  RNApol I  94.4
4C2M chain 4  Rpa43  RNApol I  100
4C2M chain 5  Rpa190  RNApol I  89.7
4C2M chain 6  Rpc40  RNApol I  100
4C2M chain 7  Rpa135  RNApol I  97.2
4C2M chain 8  Rpb5  RNApol I  100
4C2M chain 9  Rpa14  RNApol I  59.6
4C2M chain 10  Rpa43  RNApol I  81.4
4C2M chain 11  Rpo26  RNApol I  100
4C2M chain 12  Rpa12  RNApol I  100
4C2M chain 13  Rpb8  RNApol I  88.2
4C2M chain 14  Rpc19  RNApol I  100
4C2M chain 15  Rpb10  RNApol I  100
4C2M chain 16  Rpa49  RNApol I  100
4C2M chain 17  Rpc10  RNApol I  100
4C2M chain 18  Rpa43  RNApol I  100
4C2M chain 19  Rpa34  RNApol I  92.4
4C2M chain 20  Rpa135  RNApol I  96.2
4C2M chain 21  Rpa190  RNApol I  88.5
4C2M chain 22  Rpa14  RNApol I  55.1
4C2M chain 23  Rpc40  RNApol I  100
4C2M chain 24  Rpo26  RNApol I  100
4C2M chain 25  Rpb5  RNApol I  100
4C2M chain 26  Rpb8  RNApol I  88.2
4C2M chain 27  Rpa43  RNApol I  80.2
4C2M chain 28  Rpb10  RNApol I  100
4C2M chain 29  Rpa12  RNApol I  96
4C2M chain 30  Rpc19  RNApol I  100
4C3I chain A  Rpa190  RNApol I  89.2
4C3I chain C  Rpc40  RNApol I  99.3
4C3I chain B  Rpa135  RNApol I  98.2
4C3I chain E  Rpb5  RNApol I  100
4C3I chain D  Rpa14  RNApol I  55.1
4C3I chain G  Rpa43  RNApol I  78.3
4C3I chain F  Rpo26  RNApol I  100
4C3I chain I  Rpa12  RNApol I  100
4C3I chain H  Rpb8  RNApol I  84.7
4C3I chain K  Rpc19  RNApol I  100
4C3I chain J  Rpb10  RNApol I  100
4C3I chain M  Rpa49  RNApol I  97.2
4C3I chain L  Rpc10  RNApol I  100
4C3I chain N  Rpa34  RNApol I  88
4V1N chain A  Rpo21  RNApol II  97.9
4V1N chain C  Rpb3  RNApol II  100
4V1N chain B  Rpb2  RNApol II  93.6
4V1N chain E  Rpb5  RNApol II  100
4V1N chain D  Rpb4  RNApol II  80.8
4V1N chain G  Rpb7  RNApol II  100
4V1N chain F  Rpo26  RNApol II  100
4V1N chain I  Rpb9  RNApol II  100
4V1N chain H  Rpb8  RNApol II  91
4V1N chain K  Rpb11  RNApol II  100
4V1N chain J  Rpb10  RNApol II  100
4V1N chain L  Rpc10  RNApol II  100
4V1N chain R  Tfg2  RNApol II  60.3
5FJA chain A  Rpo31  RNApol III  96.2
5FJA chain C  Rpc40  RNApol III  100
5FJA chain B  Ret1  RNApol III  100
5FJA chain E  Rpb5  RNApol III  100
5FJA chain D  Rpc17  RNApol III  73.9
5FJA chain G  Rpc25  RNApol III  85.8
5FJA chain F  Rpo26  RNApol III  100
5FJA chain I  Rpc11  RNApol III  82.7
5FJA chain H  Rpb8  RNApol III  94.5
5FJA chain K  Rpc19  RNApol III  100
5FJA chain J  Rpb10  RNApol III  100
5FJA chain M  Rpc37  RNApol III  84.9
5FJA chain L  Rpc10  RNApol III  100
5FJA chain O  Rpc82  RNApol III  84.3
5FJA chain N  Rpc53  RNApol III  73.8
5FJA chain Q  Rpc31  RNApol III  100
5FJA chain P  Rpc34  RNApol III  57.2


Table S2C Identity between proteasome structure and the experimental sequence

Reference Yeast proteins Complex Identity (%)

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 97.1

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 97.1

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 82.9

5A5B-centered chain E Pre2 Proteasome 99.5

5A5B-centered chain EA Rpn11 Proteasome 89.5

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 85.9

5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 93.1

5A5B-centered chain W Rpn2 Proteasome 90.9

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 93.1

Constructed proteasome chain 22 Rpn2 Proteasome 90.9

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 82.9

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 89.5

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 85.9

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100


Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100


Table S2D Number of missing residues in C-termini of studied proteins in RNApol I

II and III and proteasome structures

Yeast proteins Complex Reference Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) are scored as positive and have a z-score above 2.5. (C) Example of the correlation observed between PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome, for all combinations of linker lengths, as a function of the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right because the 4xL linker can detect interactions between proteins that are farther apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
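The thresholding described in panel (B) can be illustrated with a minimal Python sketch. It assumes that the z-score of a colony is computed against the mean and standard deviation of all colony sizes in the same experiment; the thesis may have used a different reference distribution. The function names and the default threshold of 2.5 are illustrative assumptions, not the author's code.

```python
import statistics


def z_scores(colony_sizes):
    """Z-score of each colony size relative to the whole distribution.

    Assumption: the reference distribution is the full set of colony
    sizes in the experiment (population standard deviation).
    """
    mean = statistics.fmean(colony_sizes)
    sd = statistics.pstdev(colony_sizes)
    if sd == 0:
        raise ValueError("colony sizes have zero variance")
    return [(size - mean) / sd for size in colony_sizes]


def call_positive_ppis(colony_sizes, z_threshold=2.5):
    """Flag colonies whose z-score is above the threshold.

    The default threshold is an assumed value, exposed as a parameter.
    """
    return [z > z_threshold for z in z_scores(colony_sizes)]
```

In this sketch a single large colony among many small ones is flagged as positive, while colonies near the background mean are not.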

Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres (dots): surface. Green spheres: surface residues on the proteasome.
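The distance-weighted shortest path of panel (C) can be sketched with Dijkstra's algorithm. The sketch below is only an illustration under stated assumptions: surface residues are given as 3D coordinates, two residues are connected when they lie within a hypothetical `cutoff` distance, and each edge is weighted by the Euclidean distance between its endpoints. How the graph was actually built in the thesis is not specified in this caption.

```python
import heapq
import math


def shortest_surface_path(points, start, end, cutoff):
    """Length of the distance-weighted shortest path between two points.

    points: list of (x, y, z) coordinates (e.g. surface residues)
    start, end: indices into points (e.g. the two C-terminal residues)
    cutoff: assumed maximum distance for two points to share an edge
    Returns math.inf when the two points are not connected.
    """
    best = [math.inf] * len(points)
    best[start] = 0.0
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == end:
            return dist
        if dist > best[node]:
            continue  # stale queue entry
        for other, coords in enumerate(points):
            if other == node:
                continue
            step = math.dist(points[node], coords)
            if step <= cutoff and dist + step < best[other]:
                best[other] = dist + step
                heapq.heappush(queue, (dist + step, other))
    return math.inf
```

Because the path is forced to hop between nearby surface points, it is at least as long as the straight-line distance, which is the intent of measuring along the surface rather than through the complex.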

General conclusion

The aim of this project was to develop a relatively simple hybrid method. Here, "hybrid method" refers to a method that detects associations between proteins that are close to each other in space without necessarily being in physical contact. Such a method would allow the architecture of protein complexes to be explored and dissected in greater depth. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the set of interactions detected. It was also relevant to test how the method applies to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results with distances measured from the available proteasome structures.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thus improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, more than 30% of all unique interactions detected were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths that are used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases. The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers is informative about the organization of protein complexes, particularly when central proteins are involved. Finally, the associations detected closely reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA is a valuable advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside the linkers of additional lengths. This would provide a validation and a point of comparison, and would help detect problems that arose during the construction of the fusion proteins. For example, a faulty recombination or the appearance of mutations becomes easier to spot: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale with a robotic platform. Moreover, the DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close to each other, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths giving several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also have to be determined, to maximize new information while accounting for the additional complexity that each added linker brings. The availability of robotic platforms makes it more realistic to build collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, might suffice, whereas a lower resolution would be appropriate for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by linker extension would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.

Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.

22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.

43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.

divided into two main categories: methods that determine the composition of protein complexes, and methods that determine the physical interactions between two proteins.

The first category includes methods that purify a protein complex, by either affinity or separation chromatography, and then analyze it by mass spectrometry (MS). The second category covers a wide range of methods, including the yeast two-hybrid (Y2H), the membrane yeast two-hybrid (MYTH) and the protein-fragment complementation assay (PCA). The methods in this second category share a very similar principle: they rely on the reconstitution of a functional reporter that emits a signal when the two proteins physically interact. The second category also includes three hybrid methods: energy transfer between fluorescent molecules (FRET), cross-linking followed by MS, and proximity-dependent biotinylation (BioID). In this context, "hybrid method" refers to methods that detect associations between proteins that are close together in space without necessarily being in physical contact. These methods therefore combine the characteristics of both categories. For the purposes of this project, they are considered part of the second category because they provide information on the spatial relationships between proteins.

The two categories of methods are complementary: one defines the components of a protein complex, and the other defines the relationships these components maintain with one another.

1.3.1 Methods identifying the members of a protein complex: purification of protein complexes followed by mass spectrometry

Purification of protein complexes followed by identification of their components by MS aims to isolate a protein complex and identify its members. Several techniques are used to purify protein complexes, including affinity chromatography. Affinity chromatography separates a protein of interest and its interactors from a protein extract by means of an epitope specific to that protein. This epitope is recognized by an antibody bound to the purification column. Several rounds of purification can be performed to reduce the non-specific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins present in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed: from a first MS scan, a peptide is selected and fragmented, and a second spectrum is acquired from the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation depends on the size of the protein complexes. The main advantage of this type of purification is that it can isolate all the protein complexes of an organism for further study (43).

1.3.2 Methods determining the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment complementation

Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments, each attached to one of the two proteins of interest through a linker. When the two proteins of interest interact physically, the two reporter fragments assemble, reconstituting a functional reporter that produces a detectable signal. In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that enabled the development of several others. However, the technique has some limitations. First, in classical Y2H, the proteins studied must be soluble. Variations of the method have nevertheless been developed to allow the study of membrane proteins (45-47); this approach is the subject of the next paragraph. Second, because the reporter is a transcription factor, the interactions tested must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, produces background noise and is not quantitative. It often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as humans, to be studied in a simpler model (51).

In MYTH, the two reporter fragments derive from a mutated ubiquitin to which a transcription factor is attached. In the presence of a physical interaction between the proteins of interest, the transcription factor attached to the reconstituted ubiquitin is released, activating transcription of a reporter gene. Split-ubiquitin-based methods have enabled major advances in the study of insoluble membrane proteins located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the impossibility of quantifying the results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of using a transcription factor as the reporter, it uses a protein that has been split into two fragments. The choice of reporter and of cleavage site were decisive elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to interact spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved notably in the synthesis of certain DNA bases (purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative, and it preserves the native promoter of the proteins studied (48, 55, 56). In addition, PCA results suggest that the cellular localization of proteins is preserved: there is gene ontology enrichment for several known proteins sharing the same cellular localization (55). However, a change in localization cannot be ruled out, since the reporter fragments are added at the C-terminal end, which could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. Adding a linker often reduces these risks by distancing the reporter fragment from the protein to which it is attached, which reduces interference between the two. Its composition or length may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are especially useful when a flexible linker is insufficient to separate the two elements properly or interferes with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on energy transfer between two fluorescent proteins located close to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are close to each other. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and partial overlap between the fluorescence of the two proteins can hamper interpretation of the results (60-63).

Cross-linking followed by MS is practically identical to the purification and MS techniques, except that the proteins are covalently linked to one another before purification. These links withstand enzymatic digestion, thereby providing structural information on how the proteins associate within the complex. Nevertheless, cross-linking complicates data analysis and can lead to an incorrect model of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity, fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions as well as interactions between neighboring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can interfere with the activity of the protein of interest or with the formation of interactions. In addition, the method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They report on protein proximity, giving access to a scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply at large scale. The development of simpler, higher-throughput hybrid methods would help better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. They would also compensate for the shortcomings of high-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes: these are difficult to apply to many protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and of the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good association of the reporter fragments in the cellular environment, since glycine and serine are two small amino acids, one nonpolar and the other polar and uncharged. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also constrains the ability to detect an interaction, as observed notably by the research team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 35 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to obtain new structural information.
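To make the relationship between linker length and reach concrete, the following minimal sketch estimates how far two linkers placed end to end could stretch. The per-residue length of 3.8 Å is an assumption for a fully extended chain and does not come from this work, so the output is not expected to reproduce the 82 Å figure from reference 55, whose derivation is not given here. The function names are illustrative only.

```python
# Rough upper bound on the reach of two flexible linkers placed end to end.
# ASSUMPTION (not from the source): a fully extended chain spans about
# 3.8 Å per residue; real linkers are flexible and usually more compact.
ANGSTROM_PER_RESIDUE = 3.8
MOTIF = "GGGGS"  # linker motif given in the text


def linker_sequence(repeats: int) -> str:
    """Peptide sequence of a linker made of `repeats` copies of the motif."""
    return MOTIF * repeats


def max_span_angstrom(repeats_a: int, repeats_b: int,
                      per_residue: float = ANGSTROM_PER_RESIDUE) -> float:
    """Extended length of two linkers placed end to end, in Å."""
    residues = len(linker_sequence(repeats_a)) + len(linker_sequence(repeats_b))
    return residues * per_residue


if __name__ == "__main__":
    # Compare the standard pair (2xL + 2xL) with longer combinations.
    for a, b in [(2, 2), (2, 4), (4, 4)]:
        print(f"{a}xL + {b}xL: about {max_span_angstrom(a, b):.0f} Å")
```

Under this assumption the reach grows linearly with the total number of repeats, which is the intuition behind using longer linkers to probe more distant protein pairs.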

1.6 Research objectives

The results above suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length in DHFR PCA would allow the detection of interactions between proteins increasingly distant in space, thereby modulating the scale of molecular resolution. This adaptation would yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations of 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increased linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The final objective was to determine whether increasing linker length allowed the detection of associations between proteins that are farther apart in space. To do so, distances between the proteins contained in the proteasome structures were calculated and compared with the experimental results.
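The distance comparison in the last objective can be sketched as follows. This is only an illustration: the protein names and coordinates are hypothetical placeholders, whereas real values would be read from a published structure, and the 82 Å cutoff is simply the threshold quoted from reference 55.

```python
import math
from itertools import combinations

# HYPOTHETICAL C-terminal atom coordinates in Å (placeholders, not real data).
c_termini = {
    "ProtA": (0.0, 0.0, 0.0),
    "ProtB": (30.0, 40.0, 0.0),
    "ProtC": (0.0, 0.0, 120.0),
}


def pairwise_distances(coords):
    """Euclidean distance between every pair of C-termini, keyed by name pair."""
    return {
        (a, b): math.dist(coords[a], coords[b])
        for a, b in combinations(sorted(coords), 2)
    }


def within_reach(distances, cutoff):
    """Pairs whose C-termini lie no farther apart than `cutoff` (Å)."""
    return {pair for pair, d in distances.items() if d <= cutoff}


if __name__ == "__main__":
    distances = pairwise_distances(c_termini)
    for pair, d in sorted(distances.items()):
        print(pair, f"{d:.1f} Å")
    print("Within 82 Å:", within_reach(distances, 82.0))
```

Pairs flagged as within reach can then be compared with the pairs that gave a PCA signal for a given linker combination.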

This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive in several respects, notably the availability of numerous powerful genetic tools, its rapid cell division and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system functions requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that using longer peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Longer linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories, based on whether they infer co-complex membership or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods that detect co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest to fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments by a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins at which an interaction can be detected. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint imposed by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 35 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to linkers of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
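The idea of encoding repeated motifs with synonymous codons can be illustrated with a short sketch. The codon choices below are arbitrary picks from the standard genetic code and are not the sequences used in the plasmids; the function names are likewise illustrative.

```python
# Standard-genetic-code codons for glycine and serine.
GLY = ["GGT", "GGC", "GGA", "GGG"]
SER = ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"]
CODON_TO_AA = {**{c: "G" for c in GLY}, **{c: "S" for c in SER}}


def motif_dna(repeat_index: int) -> str:
    """DNA for one GGGGS motif; codon usage is rotated per repeat.

    ILLUSTRATIVE codon choice only, not the authors' actual sequences.
    """
    gly = [GLY[(repeat_index + i) % len(GLY)] for i in range(4)]
    ser = SER[repeat_index % len(SER)]
    return "".join(gly) + ser


def linker_dna(repeats: int) -> str:
    """DNA for a linker with the given number of GGGGS repeats."""
    return "".join(motif_dna(i) for i in range(repeats))


def translate(dna: str) -> str:
    """Translate a Gly/Ser-only coding sequence into one-letter amino acids."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    for n in (2, 3, 4):
        dna = linker_dna(n)
        print(f"{n}xL", dna, translate(dna))
```

Each repeat differs at the DNA level while the translated peptide is unchanged, which is the property the sentence above describes.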

To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 with flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed into E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.
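For readers who want to turn a molar insert:vector ratio into DNA masses, a minimal sketch follows. It assumes the ratio is molar and uses the usual proportionality between mass and length for double-stranded DNA; the fragment sizes and vector mass in the example are hypothetical and are not taken from this work.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int,
                   insert_bp: int, molar_ratio: float) -> float:
    """Mass of insert (ng) giving `molar_ratio` insert:vector molecules.

    For double-stranded DNA, mass is proportional to length, so equal
    molar amounts scale with insert_bp / vector_bp.
    """
    return vector_ng * (insert_bp / vector_bp) * molar_ratio


if __name__ == "__main__":
    # HYPOTHETICAL sizes: 5000 bp vector backbone, 500 bp insert, 100 ng vector.
    mass = insert_mass_ng(vector_ng=100, vector_bp=5000,
                          insert_bp=500, molar_ratio=5)
    print(f"Insert needed for a 5:1 ratio: {mass:.1f} ng")
```

The same function can be called once per insert when several fragments are assembled at their own ratios to the vector.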

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR-amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were made at the 3′ end of genes. 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains with proteins fused to the 2xL and 4xL. These proteins were selected because we could readily assess their abundance using antibodies raised against them. Exponentially growing cells (20 OD600) were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL pepstatin A, 0.5 µg/mL leupeptin and 2 µg/mL aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C, membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10,000), IRDye® 680RD donkey anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG (1:10,000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed, and the signal on membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them, based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published by (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested), proteasome (n=47, 44 tested) and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 of the 95 proteins initially selected are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above.


Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
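The block design described above can be sketched in a few lines. The row-major placement of the four combinations inside a 2 x 2 block is an assumption made for illustration, and the bait and prey names are placeholders; the actual positions on the arrays are not specified here.

```python
from itertools import product

LINKERS = ("2xL", "4xL")


def block_layout(bait: str, prey: str) -> dict:
    """Map (row, col) in a 2 x 2 block to a (bait, prey) linker combination.

    ASSUMPTION: combinations are placed row by row in the order
    2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL.
    """
    combos = list(product(LINKERS, repeat=2))
    return {
        (i // 2, i % 2): (f"{bait}-{b}", f"{prey}-{p}")
        for i, (b, p) in enumerate(combos)
    }


if __name__ == "__main__":
    # "BaitX" and "PreyY" are placeholder names.
    for position, pair in sorted(block_layout("BaitX", "PreyY").items()):
        print(position, pair)
```

Keeping the four linker combinations adjacent means they share the same local plate environment, which makes within-block comparisons of colony size more direct.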

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed into 384-format arrays and finally into 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey plates were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30°C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we used standard DHFR PCA to confirm 25 PPIs for which the difference in signal was increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection, using particle detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area smaller than 20 pixels and a circularity lower than 0.5, and we used the particle closest to the center. We considered the particle to be a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value, so that colonies with an intensity of zero remained defined. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
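The transformation and the diploid-size filter described above can be illustrated with a minimal sketch. The record layout (pairs of MTX intensity and diploid colony size), the function names and the parameter name `min_diploid_size` are assumptions made for this example only; they do not come from the analysis scripts used in this work.

```python
import math


def log2_intensity(intensity):
    """Return log2(intensity + 1); adding 1 keeps zero intensities defined."""
    return math.log2(intensity + 1)


def transform_colonies(colonies, min_diploid_size):
    """Drop colonies that are too small on the diploid plate, then transform.

    `colonies` is a list of (mtx_intensity, diploid_size) tuples
    (hypothetical layout chosen for this illustration).
    """
    return [
        log2_intensity(mtx_intensity)
        for mtx_intensity, diploid_size in colonies
        if diploid_size >= min_diploid_size
    ]


# Toy values: the third colony is removed because its diploid size is below the cutoff.
example = [(0, 40), (3, 25), (1023, 5)]
print(transform_colonies(example, min_diploid_size=16))
```

Passing the cutoff as a parameter keeps the sketch independent of the specific threshold applied to a given set of plates.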

For the global PCA experiment, interactions with at least two replicates for all linker combinations were conserved and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered as significantly detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes to be significant when Zs differed by more than 2.
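The scoring scheme above can be sketched in a few lines. The analysis itself relied on the R package mixtools; the block below is only an illustrative stand-in written in plain Python, with a small expectation-maximization loop replacing normalmixEM. It assumes that the background is the mixture component with the lower mean, and the initial values, the function names and the simulated scores are all choices made for this example.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(scores, n_iter=200):
    """Fit a two-component normal mixture by EM.

    Returns two (weight, mean, sd) tuples sorted by increasing mean.
    """
    # Illustrative starting point: one component at the median, one at the maximum.
    mu = [statistics.median(scores), max(scores)]
    sd = [statistics.pstdev(scores) or 1.0] * 2
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: probability that each score belongs to component 0.
        resp0 = []
        for x in scores:
            p0 = w[0] * normal_pdf(x, mu[0], sd[0])
            p1 = w[1] * normal_pdf(x, mu[1], sd[1])
            resp0.append(p0 / (p0 + p1) if p0 + p1 > 0 else 0.5)
        # M-step: re-estimate weight, mean and sd of each component.
        for k, resp in enumerate((resp0, [1 - r for r in resp0])):
            total = sum(resp)
            if total == 0:
                continue
            w[k] = total / len(scores)
            mu[k] = sum(r * x for r, x in zip(resp, scores)) / total
            var = sum(r * (x - mu[k]) ** 2 for r, x in zip(resp, scores)) / total
            sd[k] = max(math.sqrt(var), 1e-6)
    return sorted(zip(w, mu, sd), key=lambda component: component[1])


def z_score(interaction_score, b, sdb):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background component."""
    return (interaction_score - b) / sdb


# Simulated interaction scores: a large background population and a smaller signal population.
random.seed(1)
scores = [random.gauss(0, 1) for _ in range(900)] + [random.gauss(6, 1) for _ in range(100)]
(_, b, sdb), _signal = fit_two_normals(scores)
detected = [s for s in scores if z_score(s, b, sdb) > 2.5]
print(f"background mean = {b:.2f}, sd = {sdb:.2f}, detected = {len(detected)}")
```

Because EM can converge to different solutions depending on its starting point, a real analysis would normally inspect the fitted components against the score histogram, as done in Fig. S1B.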

For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as values below Q1 - 3 x (Q3 - Q1) or above Q3 + 3 x (Q3 - Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were conserved and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered as being detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented inconsistent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
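The extreme-outlier rule and the replicate requirement described above can be sketched as follows. This is a minimal illustration and not the script used for the analysis: the quartile interpolation method, the function names and the example values are assumptions.

```python
import statistics


def drop_extreme_outliers(values, k=3.0):
    """Remove values below Q1 - k*(Q3 - Q1) or above Q3 + k*(Q3 - Q1)."""
    # The "inclusive" quartile method is an assumption; other methods give slightly different fences.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def interaction_score(values, min_replicates=4):
    """Median colony size after outlier removal, or None if too few replicates remain."""
    kept = drop_extreme_outliers(values)
    if len(kept) < min_replicates:
        return None
    return statistics.median(kept)


# Toy replicate values for one bait-prey pair; 100 is an extreme outlier.
replicates = [10, 11, 12, 13, 14, 100]
print(drop_extreme_outliers(replicates))
print(interaction_score(replicates))
```

Using a multiplier of 3 rather than the more common 1.5 removes only the most extreme colonies, which preserves replicates for weak interactions.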

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
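The three categories defined above amount to a simple decision rule, sketched below. The text does not state which FDR procedure was applied; the Benjamini-Hochberg adjustment used here is an assumption, as are the function names and the simplified handling of pairs that are not detected with any linker combination. The p-values are taken as given, since the paired t-test itself is outside the scope of this sketch.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(z_ref, z_long, diffs, fdr_p,
                  z_cutoff=2.5, diff_cutoff=1.0, alpha=0.05):
    """Classify one protein pair.

    z_ref  : Zs with the 2xL-2xL reference
    z_long : Zs for the longer combinations (2xL-4xL, 4xL-2xL, 4xL-4xL)
    diffs  : Is difference of each longer combination versus the reference
    fdr_p  : FDR-adjusted p-values of the corresponding paired t-tests
    """
    if z_ref <= z_cutoff:
        detected_long = any(z > z_cutoff for z in z_long)
        return "new" if detected_long else "not detected"
    changed = any(abs(d) > diff_cutoff and p < alpha for d, p in zip(diffs, fdr_p))
    return "quantitative change" if changed else "unchanged"


print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
print(classify_pair(1.0, [3.0, 0.5, 4.0], [2.0, 0.1, 3.0], [0.01, 0.9, 0.001]))
print(classify_pair(5.0, [7.0, 5.0, 8.0], [1.5, 0.1, 2.0], [0.01, 0.9, 0.001]))
```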

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in the PyMOL software. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.

The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome, respectively. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure were considered as surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal distances possible was used for the analyses.
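The surface-path distance described above can be illustrated with a small, dependency-free sketch. The analysis used the Dijkstra implementation of NetworkX; the block below substitutes a plain-Python Dijkstra built on `heapq` so that it runs without external packages. The node names, coordinates and function names are invented for the example and do not correspond to any residue of the structures studied here.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`; edge weight = Euclidean distance.

    `coords` maps a node name to its (x, y, z) coordinates, e.g. surface C-alpha atoms.
    """
    nodes = list(coords)
    graph = {node: [] for node in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Weighted shortest-path length (Dijkstra); math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf


# Four hypothetical surface atoms; A and C are too far apart to be linked directly.
coords = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (20, 0, 0), "D": (10, 8, 0)}
graph = build_graph(coords, cutoff=15)
print(shortest_path_length(graph, "A", "C"))
```

Restricting edges to nearby surface atoms forces the path to follow the surface of the complex rather than cut through it, which is why the resulting distance can exceed the straight-line distance between two C-termini.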

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These preys include proteins known to interact with the baits, proteins within the same complexes as the baits and random proteins used as controls, for a total of 26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtering (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] accounting for only one PPI).

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in the interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). The PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, the PCA signal obtained with longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all lengths of linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linker sizes (Fig. S1D). Finally, we find that the distance among proteins is significantly longer for cases where a longer linker size increases the signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker size enhances the ability to detect interactions, especially for proteins that are more distant in space.

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research (CIHR) grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.

Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering every pair of reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.

Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.

Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference       Yeast protein   Complex      Identity (%)
4C2M chain 1    Rpc10    RNApol I    100
4C2M chain 2    Rpa34    RNApol I    92.4
4C2M chain 3    Rpa49    RNApol I    94.4
4C2M chain 4    Rpa43    RNApol I    100
4C2M chain 5    Rpa190   RNApol I    89.7
4C2M chain 6    Rpc40    RNApol I    100
4C2M chain 7    Rpa135   RNApol I    97.2
4C2M chain 8    Rpb5     RNApol I    100
4C2M chain 9    Rpa14    RNApol I    59.6
4C2M chain 10   Rpa43    RNApol I    81.4
4C2M chain 11   Rpo26    RNApol I    100
4C2M chain 12   Rpa12    RNApol I    100
4C2M chain 13   Rpb8     RNApol I    88.2
4C2M chain 14   Rpc19    RNApol I    100
4C2M chain 15   Rpb10    RNApol I    100
4C2M chain 16   Rpa49    RNApol I    100
4C2M chain 17   Rpc10    RNApol I    100
4C2M chain 18   Rpa43    RNApol I    100
4C2M chain 19   Rpa34    RNApol I    92.4
4C2M chain 20   Rpa135   RNApol I    96.2
4C2M chain 21   Rpa190   RNApol I    88.5
4C2M chain 22   Rpa14    RNApol I    55.1
4C2M chain 23   Rpc40    RNApol I    100
4C2M chain 24   Rpo26    RNApol I    100
4C2M chain 25   Rpb5     RNApol I    100
4C2M chain 26   Rpb8     RNApol I    88.2
4C2M chain 27   Rpa43    RNApol I    80.2
4C2M chain 28   Rpb10    RNApol I    100
4C2M chain 29   Rpa12    RNApol I    96
4C2M chain 30   Rpc19    RNApol I    100
4C3I chain A    Rpa190   RNApol I    89.2
4C3I chain C    Rpc40    RNApol I    99.3
4C3I chain B    Rpa135   RNApol I    98.2
4C3I chain E    Rpb5     RNApol I    100
4C3I chain D    Rpa14    RNApol I    55.1
4C3I chain G    Rpa43    RNApol I    78.3
4C3I chain F    Rpo26    RNApol I    100
4C3I chain I    Rpa12    RNApol I    100
4C3I chain H    Rpb8     RNApol I    84.7
4C3I chain K    Rpc19    RNApol I    100
4C3I chain J    Rpb10    RNApol I    100
4C3I chain M    Rpa49    RNApol I    97.2
4C3I chain L    Rpc10    RNApol I    100
4C3I chain N    Rpa34    RNApol I    88
4V1N chain A    Rpo21    RNApol II   97.9
4V1N chain C    Rpb3     RNApol II   100
4V1N chain B    Rpb2     RNApol II   93.6
4V1N chain E    Rpb5     RNApol II   100
4V1N chain D    Rpb4     RNApol II   80.8
4V1N chain G    Rpb7     RNApol II   100
4V1N chain F    Rpo26    RNApol II   100
4V1N chain I    Rpb9     RNApol II   100
4V1N chain H    Rpb8     RNApol II   91
4V1N chain K    Rpb11    RNApol II   100
4V1N chain J    Rpb10    RNApol II   100
4V1N chain L    Rpc10    RNApol II   100
4V1N chain R    Tfg2     RNApol II   60.3
5FJA chain A    Rpo31    RNApol III  96.2
5FJA chain C    Rpc40    RNApol III  100
5FJA chain B    Ret1     RNApol III  100
5FJA chain E    Rpb5     RNApol III  100
5FJA chain D    Rpc17    RNApol III  73.9
5FJA chain G    Rpc25    RNApol III  85.8
5FJA chain F    Rpo26    RNApol III  100
5FJA chain I    Rpc11    RNApol III  82.7
5FJA chain H    Rpb8     RNApol III  94.5
5FJA chain K    Rpc19    RNApol III  100
5FJA chain J    Rpb10    RNApol III  100
5FJA chain M    Rpc37    RNApol III  84.9
5FJA chain L    Rpc10    RNApol III  100
5FJA chain O    Rpc82    RNApol III  84.3
5FJA chain N    Rpc53    RNApol III  73.8
5FJA chain Q    Rpc31    RNApol III  100
5FJA chain P    Rpc34    RNApol III  57.2

Table S2C Identity between proteasome structure and the experimental sequence

Reference Yeast

proteins Complex

Identity

()

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 971

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 971

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 829

5A5B-centered chain E Pre2 Proteasome 995

5A5B-centered chain EA Rpn11 Proteasome 895

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 859

35

5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 931

5A5B-centered chain W Rpn2 Proteasome 909

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 931

Constructed proteasome chain 22 Rpn2 Proteasome 909

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 829

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 895

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 859

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100

Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 931

Constructed proteasome chain 58 Rpn2 Proteasome 909

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 829

Constructed proteasome chain 66 Rpn11 Proteasome 895

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 859

Constructed proteasome chain 9 Scl1 Proteasome 100

Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

Yeast proteins  Complex  Reference  Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6

Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23

Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7

Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 25. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths, according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
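The thresholding described in panel (B) can be illustrated with a short Python sketch that converts colony sizes to z-scores and keeps the pairs above a cutoff. This is only a minimal sketch under stated assumptions, not the analysis pipeline behind the figure: the function names are hypothetical, the z-score is assumed to be computed against the mean and sample standard deviation of all colonies passed in, and the cutoff is left as a parameter rather than taken from the caption.

```python
import statistics


def z_scores(values):
    """Return the z-score of each value relative to the whole list.

    Assumption: the reference distribution is simply all values given,
    using the sample standard deviation.
    """
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        # All colonies have the same size: nothing stands out.
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


def positive_ppis(colony_sizes, threshold):
    """Keep the protein pairs whose colony-size z-score exceeds threshold.

    colony_sizes: dict mapping a (bait, prey) pair to a colony size.
    Returns a dict mapping each retained pair to its z-score.
    """
    pairs = list(colony_sizes)
    scores = z_scores([colony_sizes[p] for p in pairs])
    return {p: z for p, z in zip(pairs, scores) if z > threshold}
```

Calling `positive_ppis(sizes, threshold=t)` on a dictionary of measured colony sizes returns only the pairs that stand out from the background growth, which is the set a threshold line in such a histogram would separate.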

Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
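The idea of a distance-weighted shortest path, as in panels (C) and (D), can be sketched in Python with Dijkstra's algorithm over a graph of surface points. This is a hedged illustration rather than the procedure actually used for the figure: the function names, the rule that two surface points are connected when they lie within a `cutoff` distance, and the use of plain Euclidean edge weights are assumptions made for the example.

```python
import heapq
import math


def build_surface_graph(points, cutoff):
    """Connect every pair of 3D points closer than or equal to cutoff.

    points: list of (x, y, z) coordinates, e.g. surface residues.
    Returns an adjacency list: node index -> list of (neighbor, distance).
    """
    graph = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra: length of the shortest distance-weighted path.

    Returns math.inf when target cannot be reached from source.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf
```

Because the path is forced to hop between nearby surface points, its length is never shorter than the straight-line distance between the two C-termini, which is the reason for preferring it when the straight line would pass through the body of the complex.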

General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close to each other in space without necessarily being in physical interaction. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and allowed better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths that are used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and those whose signal increases. The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. The use of extended linkers also informs on the organization of protein complexes, particularly when it involves the central proteins. Finally, the detected associations reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to the DHFR PCA represents a clear advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside linkers of additional lengths, which would provide a validation and a point of comparison, and would help detect problems that arose during construction of the proteins. For example, it is easier to spot a faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet can also be applied at large scale with a robotic platform. Moreover, the DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously with each other unless they are very close, which reduces false positives. The DHFR PCA can be carried out on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that provides several resolutions of the PPI network. One would first need to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. One would also need to determine the optimal increment to maximize the new information gained, taking into account the additional complexity that each added linker brings. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, could be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.

The method developed in this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. For the proteome of an organism, this method would provide a better definition of the organization of the cellular machinery.

Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.

22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.

43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


in order to reduce the nonspecific interactions that cause background noise. The isolated proteins are then digested into peptides. The mass spectrometer ionizes these peptides and separates them according to their mass-to-charge ratio, producing a mass spectrum. Comparing the resulting profiles with those in a database identifies the proteins present in the complex (38-40). Tandem mass spectrometry (MS/MS) can also be performed: starting from a first MS scan, a peptide is selected and fragmented, and a second round of spectrometry is carried out on the resulting fragments. This additional spectrum provides more information about the peptide (41, 42). Other purification techniques exist, such as size-exclusion chromatography, in which separation is based on the size of the protein complexes. The main advantage of this type of purification is that it can isolate all of the protein complexes of an organism for further study (43).

1.3.2 Methods for determining the protein interaction network

1.3.2.1 Yeast two-hybrid, membrane yeast two-hybrid and protein-fragment complementation

Y2H, MYTH and PCA are techniques based on the assembly of complementary reporter fragments, each attached to one of the two proteins of interest through a linker. When the two proteins of interest interact physically, the two reporter fragments assemble and reconstitute a functional reporter that produces a detectable signal. In Y2H, the reporter is a transcription factor that, once reconstituted, allows the yeast S. cerevisiae to grow on a specific selection medium. Initially, the transcription factor was Gal4p and the selection medium contained galactose (44). Y2H was a pioneering method that enabled the development of several others. However, the technique has some limitations. First, in classical Y2H the proteins under study must be soluble. Variations of the method have nevertheless been introduced to allow the study of membrane proteins (45-47); this approach is covered in the next paragraph. Second, because the reporter is a transcription factor, the tested interactions must take place in the nucleus, which may alter the endogenous localization of the proteins. The technique also has low sensitivity, suffers from background noise and is not quantitative. It often requires protein overexpression, which can generate false positives. As a result, it is impossible to relate the abundance of a protein to the strength or abundance of an interaction between proteins (48-50). Despite these constraints, Y2H is still widely used because it allows the PPIs of another species, such as human, to be studied in a simpler model (51).

In MYTH, the two reporter fragments derive from a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest interact physically, the transcription factor attached to the reconstituted ubiquitin is released and activates transcription of a reporter gene. Methods based on split-ubiquitin have led to major advances in the study of membrane proteins, insoluble proteins and proteins located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the inability to quantify results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of a transcription factor it uses as reporter a protein that has been split into two fragments. The choice of the reporter and of the cleavage site were key elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to associate spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that confers resistance to methotrexate (MTX) on the cell. This enzyme is essential for cell growth and is involved, among other things, in the synthesis of certain DNA bases (purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative while keeping the native promoter of the proteins under study (48, 55, 56). In addition, PCA results suggest that the cellular localization of proteins is preserved: there is a gene ontology enrichment for several known proteins that share the same cellular localization (55). However, a change in localization cannot be ruled out, because the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. Adding a linker often reduces these risks by distancing the reporter fragment from the protein to which it is attached, which reduces interference between the two. The composition or length of the linker may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are most useful when a flexible linker is insufficient to separate the two elements properly or when it interferes with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at a lower resolution.

FRET relies on energy transfer between two fluorescent proteins located close to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are near each other. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and the partial overlap between the fluorescence of the two proteins can hinder interpretation of the results (60-63).
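
The distance dependence that makes FRET a proximity reporter can be illustrated numerically. The sketch below uses the standard Förster relation, E = 1 / (1 + (r/R0)^6). The Förster radius of 5 nm and the test distances are illustrative assumptions, not values taken from this work.

```python
def fret_efficiency(distance_nm: float, forster_radius_nm: float) -> float:
    """Transfer efficiency from the Förster relation E = 1 / (1 + (r / R0)**6)."""
    if distance_nm < 0 or forster_radius_nm <= 0:
        raise ValueError("distance must be >= 0 and the Förster radius must be > 0")
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)


R0_NM = 5.0  # assumed Förster radius, for illustration only

for r_nm in (2.5, 5.0, 7.5, 10.0):
    print(f"r = {r_nm:4.1f} nm -> E = {fret_efficiency(r_nm, R0_NM):.3f}")
```

Efficiency drops steeply around R0, which is why a signal is only expected when the two fluorescent proteins lie within a few nanometres of each other.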

Cross-linking followed by MS is almost identical to the purification and MS techniques, except that before purification the proteins are attached to one another by covalent bonds. These bonds resist enzymatic digestion and thus provide structural information on how proteins associate within the protein complex. However, cross-linking complicates data analysis and can lead to an incorrect picture of the architecture of the protein complex. The method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase that lacks specificity and is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions as well as interactions between neighbouring proteins. However, the biotin ligase is larger than green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can impair the activity of the protein of interest or the formation of interactions. In addition, the method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They report on protein proximity, giving access to a new scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, the existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would help define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. They would also compensate for the shortcomings of high-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, those high-resolution methods are difficult to apply to many protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and the linker that joins the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good association of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small amino acids, the first nonpolar and the second polar and uncharged. The linker joins the reporter fragment to the C-terminus of the proteins under study.

Linker length also places a constraint on the ability to detect an interaction, as was notably observed by the research team that developed large-scale PCA (55). By studying RNA polymerase (RNApol) II and several other protein complexes, the authors found that an interaction was 35 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. Moreover, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to obtain new structural information.
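
As a rough illustration of how linker length could set the detection range, the sketch below scales the 82 Å figure cited above to longer linkers. It assumes, purely for illustration, that the bridged distance grows linearly with the number of GGGGS repeats carried by the two fusion partners. The per-residue length is back-calculated from 82 Å over 20 residues and is not a measured value, and real flexible linkers are rarely fully extended.

```python
from itertools import combinations_with_replacement

RESIDUES_PER_REPEAT = 5          # one GGGGS motif
REFERENCE_SPAN_ANGSTROM = 82.0   # two 2x linkers end to end, as stated in the text
REFERENCE_RESIDUES = 2 * 2 * RESIDUES_PER_REPEAT  # 2 linkers x 2 repeats x 5 residues
# Assumption: linear scaling, i.e. a constant contribution per residue.
ANGSTROM_PER_RESIDUE = REFERENCE_SPAN_ANGSTROM / REFERENCE_RESIDUES


def max_span_angstrom(repeats_a: int, repeats_b: int) -> float:
    """Hypothetical upper bound on the C-terminus to C-terminus distance bridged."""
    residues = (repeats_a + repeats_b) * RESIDUES_PER_REPEAT
    return residues * ANGSTROM_PER_RESIDUE


for a, b in combinations_with_replacement((2, 3, 4), 2):
    print(f"{a}xL + {b}xL: up to about {max_span_angstrom(a, b):.0f} Å")
```

Under these assumptions the values are best read as upper bounds on reach, useful for ranking linker combinations rather than for measuring distances.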

1.6 Research objectives

The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the length of the DHFR PCA linker would allow the detection of interactions between proteins located farther and farther apart in space, which would modulate the scale of molecular resolution. This adaptation would then yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increased linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The final objective was to determine whether increasing linker length allowed the detection of associations between proteins located farther apart in space. To do so, distances between the proteins contained in the proteasome structures were calculated and compared with the experimental results.

This study was carried out in the model eukaryote S. cerevisiae. Yeast is particularly attractive in several respects, notably the availability of numerous powerful genetic tools, its rapid cell division, and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Résumé

Understanding how the cellular system functions requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural constraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.

Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that using longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for detecting and measuring protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.

Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories, based on whether they infer co-complex membership or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, the assignment of a protein to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods, because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary: the first tells us which proteins assemble into complexes in the cell, and the second how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods that detect co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint imposed by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 35 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter allows the detection of configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).

Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons encoding the same peptide sequence.
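
The idea of extending the linker with synonymous codons can be sketched in a few lines. The example below builds a DNA sequence for n repeats of GGGGS while rotating through the standard glycine and serine codons, so that successive repeats differ at the DNA level but encode the same peptide. The codon rotation scheme is an assumption made for illustration; the codons actually used in the plasmids are not given in the text.

```python
# Standard genetic code entries for glycine and serine.
SYNONYMOUS_CODONS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
}
CODON_TO_AA = {codon: aa for aa, codons in SYNONYMOUS_CODONS.items() for codon in codons}


def encode_linker(n_repeats: int, motif: str = "GGGGS") -> str:
    """Encode n repeats of the motif, shifting codon choice at every repeat."""
    codons = []
    for repeat in range(n_repeats):
        for position, aa in enumerate(motif):
            options = SYNONYMOUS_CODONS[aa]
            # Hypothetical rotation: the codon index depends on position and repeat.
            codons.append(options[(position + repeat) % len(options)])
    return "".join(codons)


def translate(dna: str) -> str:
    """Translate a glycine/serine-only coding sequence back to peptide."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


for n in (2, 3, 4):
    dna = encode_linker(n)
    print(f"{n}xL: {dna} -> {translate(dna)}")
```

Varying codons between repeats is a common way to limit repetitive DNA, which can otherwise complicate synthesis, PCR and recombination-based cloning.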

To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted into the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI, and the fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed into E. coli and clones were selected on 2YT+Amp. Finally, positive clones were confirmed by double digestion with XbaI and BamHI and by Sanger sequencing.
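
An insert:vector ratio can be converted into DNA amounts with a simple proportionality. The sketch below assumes that the ratio is a molar ratio (the text does not state this explicitly) and uses hypothetical fragment sizes and vector mass; none of these numbers come from the protocol.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   molar_ratio: float) -> float:
    """Mass of insert giving the requested insert:vector molar ratio.

    For double-stranded DNA, mass scales with length, so
    insert_ng = vector_ng * ratio * insert_bp / vector_bp.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    return vector_ng * molar_ratio * insert_bp / vector_bp


# Hypothetical example: 100 ng of a 5000 bp vector and a 500 bp insert at 5:1.
print(f"{insert_mass_ng(100.0, 5000, 500, 5.0):.1f} ng of insert")
```

For a three-fragment assembly, the same function can be called once per insert with its own length and ratio.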

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR-amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI, and the fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were made at the 3′ end of genes. The 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). Proper DHFR fragment fusions were confirmed for all strains by PCR and Sanger sequencing.

Estimation of protein abundance

Protein abundance was quantified by Western blot for several strains expressing proteins fused to the 2xL and 4xL. These proteins were selected because their abundance could be readily assessed using antibodies directed against them. Exponentially growing cells (20 OD600) were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL pepstatin A, 0.5 µg/mL leupeptin and 2 µg/mL aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10000), IRDye® 680RD donkey anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed and the signal on the membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them, based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we performed a review of the literature and considered the consensus protein complexes published by (84) to choose 95 central and associated protein members of the following complexes: the RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested), proteasome (n=47, 44 tested) and COG complex (n=8)) and interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above.

Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]/Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
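
The block design described above can be illustrated with a minimal sketch that enumerates the four linker combinations tested for one bait-prey pair. The function name and the label format are hypothetical and only serve the illustration; they are not taken from the original analysis scripts.

```python
from itertools import product

# Linker lengths used in the intra-complexes experiment.
LINKERS = ("2xL", "4xL")


def linker_block(linkers=LINKERS):
    """Return the bait-prey linker combinations that make up one block.

    Each label is written bait linker first, prey linker second
    (an assumed convention for this sketch).
    """
    return [f"{bait}-{prey}" for bait, prey in product(linkers, repeat=2)]


block = linker_block()
print(block)
```

Keeping the four combinations adjacent on the array means that they share the same local growth conditions, which is what allows the direct comparison of interaction scores between linker combinations.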

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30°C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which differences in signal were increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. Correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction where the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and circularity below 0.5, using the particle that is closest to the center. We considered the particle as being a colony if the mass center was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2 transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
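
The transformation and filter described above can be sketched as follows. This is a minimal illustration, assuming that each colony is represented by a hypothetical pair (size on the diploid selection plate, intensity on the MTX plate); the original scripts may organize the data differently.

```python
import math


def log2_intensity(raw_intensity):
    """log2(x + 1): adding 1 keeps colonies with zero intensity finite."""
    return math.log2(raw_intensity + 1)


def filter_and_transform(colonies, min_diploid_size=16):
    """Drop colonies that grew poorly on diploid selection, then transform.

    colonies: iterable of (diploid_size, mtx_intensity) pairs
    (hypothetical layout chosen for this sketch).
    """
    return [
        log2_intensity(mtx_intensity)
        for diploid_size, mtx_intensity in colonies
        if diploid_size >= min_diploid_size
    ]


# Toy values, not data from the study.
print(filter_and_transform([(20, 0), (10, 100), (16, 3)]))
```

The second colony is removed because its diploid size is below the threshold, so its MTX intensity never enters the downstream scores.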

For the global PCA experiment, interactions with at least two replicates for all linker combinations were kept and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (μb) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − μb)/sdb). Interactions with a Zs greater than 2.5 were considered significant detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
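
To make the scoring procedure concrete, the sketch below fits a two-component normal mixture with a plain expectation-maximization loop and converts scores into z-scores relative to the background component. It is not the mixtools implementation used in the study: the initialization, the fixed number of iterations and the simulated scores are all assumptions made for the illustration.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sd):
    """Density of a normal distribution N(mu, sd) at x."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_normals(scores, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture by plain EM.

    Returns two [weight, mean, sd] lists sorted by mean, so the first
    entry is taken as the background component (an assumption).
    """
    xs = list(scores)
    spread = statistics.pstdev(xs)
    # Hypothetical initialization: background at the median, signal at the maximum.
    params = [[0.5, statistics.median(xs), spread], [0.5, max(xs), spread]]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, mu, sd) for w, mu, sd in params]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and sd of each component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue
            mu = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, xs)) / nk
            params[k] = [nk / len(xs), mu, max(math.sqrt(var), 1e-6)]
    return sorted(params, key=lambda p: p[1])


def z_score(interaction_score, mu_b, sd_b):
    """Zs = (Is - mu_b) / sd_b, as defined in the text."""
    return (interaction_score - mu_b) / sd_b


# Simulated scores (not data from the study): background plus a smaller signal group.
random.seed(0)
scores = [random.gauss(0.0, 1.0) for _ in range(300)]
scores += [random.gauss(8.0, 1.0) for _ in range(60)]
(_, mu_b, sd_b), _signal = fit_two_normals(scores)
detected = [s for s in scores if z_score(s, mu_b, sd_b) > 2.5]
```

Only the background component is used for scoring; the second component simply absorbs the true interactions so that they do not inflate the background mean and standard deviation.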

For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were kept and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (μb) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
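
The extreme-outlier rule used at the start of this filtering (values more than three interquartile ranges beyond the quartiles) can be sketched as below. The quartile interpolation method is an assumption, since the text does not state which definition was used, and the replicate values are invented for the example.

```python
import statistics


def drop_extreme_outliers(values, k=3.0):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR].

    Quartiles are computed with the 'inclusive' method (an assumption;
    other quartile definitions give slightly different fences).
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Toy replicate values for one interaction, not data from the study.
replicates = [10, 11, 12, 13, 14, 15, 16, 100]
print(drop_extreme_outliers(replicates))
```

With k = 3 the fences are wide, so only grossly aberrant colonies (for example contamination or a pinning failure) are removed before the median is taken.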

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination: 2xL-4xL, 4xL-2xL and 4xL-4xL. A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1 with t-test FDR p-value < 0.05) (Table S1C).
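
The classification rules above can be summarized in a short sketch. Two points are assumptions rather than statements from the text: the FDR correction is implemented here as the Benjamini-Hochberg procedure (the text only says "FDR corrected"), and a pair with a significant p-value but an absolute difference of 1 or less is treated as unchanged. The p-values are taken as inputs; the paired t-test itself is not reimplemented.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(detected_ref, detected_longer, diff, fdr_p,
                  min_diff=1.0, alpha=0.05):
    """Classify one protein pair for one longer linker combination.

    detected_ref:    detected with the 2xL-2xL reference
    detected_longer: detected with the longer linker combination
    diff:            Is(longer) - Is(2xL-2xL)
    fdr_p:           FDR-corrected p-value of the paired t-test
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    if abs(diff) > min_diff and fdr_p < alpha:
        return "quantitative change"
    return "unchanged"


# Toy p-values, not results from the study.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print(adjusted)
print(classify_pair(True, True, 1.8, adjusted[0]))
```

Requiring both an effect-size threshold and a corrected p-value avoids calling a change when many replicates make a very small difference statistically significant.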

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB protein data bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB protein data bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB protein data bank identified PDB IDs 5A5B and 5CZ4. Structure PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL software. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.

The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of the edges was equal to the distance between node pairs. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome, respectively. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure were considered as surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal distances possible was used for the analyses.
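
The surface-path distance described above can be illustrated with a small self-contained sketch. The study used the Dijkstra implementation of NetworkX; the version below uses only the Python standard library so that it runs without dependencies, and the four coordinates are toy points, not residues from any structure.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`.

    coords: {node_id: (x, y, z)} for surface C-alpha atoms.
    Edge weights are Euclidean distances, as in the text.
    """
    graph = {node: [] for node in coords}
    nodes = list(coords)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra shortest weighted path length; inf if unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            nd = d + weight
            if nd < best.get(neighbor, math.inf):
                best[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return math.inf


# Toy surface points in angstroms (hypothetical, for illustration only).
coords = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (20, 0, 0), "D": (20, 10, 0)}
graph = build_graph(coords, cutoff=15)
path_length = shortest_path_length(graph, "A", "D")
straight_line = math.dist(coords["A"], coords["D"])
```

In this toy case the path length (A to B to D) is longer than the straight-line distance between A and D, which is the intended behavior: the path is constrained to hop between neighboring surface points instead of cutting through the complex.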

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with regular 2xL). These include proteins known to interact with the baits, that are within the same complexes as the baits or that are random proteins used as controls, for a total of 26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases) as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
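
As a quick consistency check on the size of the screen reported in this section, the counts can be recomputed from the numbers given in the text. The variable names are only illustrative.

```python
# Counts stated in the text.
n_bait_proteins = 15
linkers = ("2xL", "3xL", "4xL")
n_selected_preys = 495   # same complex as, or reported interactor of, a bait
n_random_preys = 97      # random cytoplasmic or nuclear controls

n_baits = n_bait_proteins * len(linkers)        # bait strains
n_preys = n_selected_preys + n_random_preys     # prey strains
n_potential_interactions = n_baits * n_preys    # bait-prey combinations

print(n_baits, n_preys, n_potential_interactions)
```

The three values (45 baits, 592 preys and 26640 combinations) match those quoted above; with four replicates per combination, the screen therefore measured four times as many colonies.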

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and COG complex were also performed. Among the 10192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique, reciprocal interactions such as X-DHFR F[1,2]/Y-DHFR F[3] and Y-DHFR F[1,2]/X-DHFR F[3] accounting for only one PPI) after filtration.

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]/Y-DHFR F[3] versus Y-DHFR F[1,2]/X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228 or 114/197 unique), no significant change in the interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the fact that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from the destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions were detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and interaction z-score of PPIs for all lengths of linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are better detected with longer linker sizes (Fig. S1D). Finally, we find that distance among proteins is significantly longer for cases where longer linker size increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that longer linker size enhances the ability to detect interactions, especially for proteins that are more distant in space.
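
The relationship between distance and z-score discussed above is a rank correlation, which the following sketch computes from scratch. The distances and z-scores are invented for the example and do not come from Table S2A; the statistical software used in the study is not stated in this section, so the implementation below is only one possible way to obtain the coefficient.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical C-terminus distances (angstroms) and interaction z-scores.
distances = [10, 20, 30, 40, 50]
z_scores = [9.0, 7.5, 8.0, 3.0, 1.0]
rho = spearman(distances, z_scores)
```

A negative coefficient, as in this toy example, means that protein pairs whose C-termini are further apart tend to give weaker PCA signal.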

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys while requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to detect low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.

Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering every reciprocal interaction such as X-DHFR F[1,2]/Y-DHFR F[3] and Y-DHFR F[1,2]/X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs on total tested for each combination of subcomplexes within complexes.

Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.

Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for intra-complexes experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference        Yeast protein   Complex      Identity (%)
4C2M chain 1     Rpc10           RNApol I     100
4C2M chain 2     Rpa34           RNApol I     92.4
4C2M chain 3     Rpa49           RNApol I     94.4
4C2M chain 4     Rpa43           RNApol I     100
4C2M chain 5     Rpa190          RNApol I     89.7
4C2M chain 6     Rpc40           RNApol I     100
4C2M chain 7     Rpa135          RNApol I     97.2
4C2M chain 8     Rpb5            RNApol I     100
4C2M chain 9     Rpa14           RNApol I     59.6
4C2M chain 10    Rpa43           RNApol I     81.4
4C2M chain 11    Rpo26           RNApol I     100
4C2M chain 12    Rpa12           RNApol I     100
4C2M chain 13    Rpb8            RNApol I     88.2
4C2M chain 14    Rpc19           RNApol I     100
4C2M chain 15    Rpb10           RNApol I     100
4C2M chain 16    Rpa49           RNApol I     100
4C2M chain 17    Rpc10           RNApol I     100
4C2M chain 18    Rpa43           RNApol I     100
4C2M chain 19    Rpa34           RNApol I     92.4
4C2M chain 20    Rpa135          RNApol I     96.2
4C2M chain 21    Rpa190          RNApol I     88.5
4C2M chain 22    Rpa14           RNApol I     55.1
4C2M chain 23    Rpc40           RNApol I     100
4C2M chain 24    Rpo26           RNApol I     100
4C2M chain 25    Rpb5            RNApol I     100
4C2M chain 26    Rpb8            RNApol I     88.2
4C2M chain 27    Rpa43           RNApol I     80.2
4C2M chain 28    Rpb10           RNApol I     100
4C2M chain 29    Rpa12           RNApol I     96
4C2M chain 30    Rpc19           RNApol I     100
4C3I chain A     Rpa190          RNApol I     89.2
4C3I chain C     Rpc40           RNApol I     99.3
4C3I chain B     Rpa135          RNApol I     98.2
4C3I chain E     Rpb5            RNApol I     100
4C3I chain D     Rpa14           RNApol I     55.1
4C3I chain G     Rpa43           RNApol I     78.3
4C3I chain F     Rpo26           RNApol I     100
4C3I chain I     Rpa12           RNApol I     100
4C3I chain H     Rpb8            RNApol I     84.7
4C3I chain K     Rpc19           RNApol I     100
4C3I chain J     Rpb10           RNApol I     100
4C3I chain M     Rpa49           RNApol I     97.2
4C3I chain L     Rpc10           RNApol I     100
4C3I chain N     Rpa34           RNApol I     88
4V1N chain A     Rpo21           RNApol II    97.9
4V1N chain C     Rpb3            RNApol II    100
4V1N chain B     Rpb2            RNApol II    93.6
4V1N chain E     Rpb5            RNApol II    100
4V1N chain D     Rpb4            RNApol II    80.8
4V1N chain G     Rpb7            RNApol II    100
4V1N chain F     Rpo26           RNApol II    100
4V1N chain I     Rpb9            RNApol II    100
4V1N chain H     Rpb8            RNApol II    91
4V1N chain K     Rpb11           RNApol II    100
4V1N chain J     Rpb10           RNApol II    100
4V1N chain L     Rpc10           RNApol II    100
4V1N chain R     Tfg2            RNApol II    60.3
5FJA chain A     Rpo31           RNApol III   96.2
5FJA chain C     Rpc40           RNApol III   100
5FJA chain B     Ret1            RNApol III   100
5FJA chain E     Rpb5            RNApol III   100
5FJA chain D     Rpc17           RNApol III   73.9
5FJA chain G     Rpc25           RNApol III   85.8
5FJA chain F     Rpo26           RNApol III   100
5FJA chain I     Rpc11           RNApol III   82.7
5FJA chain H     Rpb8            RNApol III   94.5
5FJA chain K     Rpc19           RNApol III   100
5FJA chain J     Rpb10           RNApol III   100
5FJA chain M     Rpc37           RNApol III   84.9
5FJA chain L     Rpc10           RNApol III   100
5FJA chain O     Rpc82           RNApol III   84.3
5FJA chain N     Rpc53           RNApol III   73.8
5FJA chain Q     Rpc31           RNApol III   100
5FJA chain P     Rpc34           RNApol III   57.2

Table S2C Identity between proteasome structure and the experimental sequence

Reference Yeast

proteins Complex

Identity

()

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 971

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 971

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 829

5A5B-centered chain E Pre2 Proteasome 995

5A5B-centered chain EA Rpn11 Proteasome 895

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 85.9

5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 93.1

5A5B-centered chain W Rpn2 Proteasome 90.9

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 93.1

Constructed proteasome chain 22 Rpn2 Proteasome 90.9

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 82.9

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 89.5

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 85.9

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100


Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100
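
The identity values in Table S2C compare each structure chain with the experimental yeast sequence. The thesis does not state here how identity was computed, so the following is only a minimal sketch of one common convention: the percentage of matching positions among aligned columns where neither sequence has a gap. The function name `percent_identity` and the use of `-` as the gap character are assumptions made for illustration.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must already be aligned (same length, '-' marks a gap).
    The choice of denominator (gap-free columns) is an assumption; other
    conventions divide by the alignment length or the shorter sequence.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_ref, aligned_query)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Toy sequences, not taken from the thesis.
    print(percent_identity("MKTA-Y", "MKSAQY"))
```

A value below 100 in the table would then correspond to a chain whose deposited sequence differs from the experimental sequence at some aligned positions.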

Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Yeast protein | Complex | Reference | Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7
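
Table S2D reports, for each protein, how many C-terminal residues are absent from the structure. As a hedged illustration of how such a count can be derived, the sketch below subtracts the last resolved residue number from the full sequence length. It assumes that residue numbering in the structure matches the 1-based numbering of the experimental sequence; the function name `missing_c_terminal_residues` is hypothetical and not taken from the thesis.

```python
from typing import Iterable


def missing_c_terminal_residues(sequence_length: int,
                                resolved_positions: Iterable[int]) -> int:
    """Number of residues after the last resolved position of a chain.

    sequence_length: length of the full experimental protein sequence.
    resolved_positions: 1-based residue numbers present in the structure
    (assumed to share the numbering of the experimental sequence).
    """
    positions = list(resolved_positions)
    if not positions:
        raise ValueError("no resolved residues for this chain")
    last_resolved = max(positions)
    if last_resolved > sequence_length:
        raise ValueError("structure numbering exceeds the sequence length")
    return sequence_length - last_resolved


if __name__ == "__main__":
    # Toy example: a 137-residue protein resolved up to residue 100.
    print(missing_c_terminal_residues(137, range(1, 101)))
```

This count matters for the distance analysis because the reporter fragment is fused at the C-terminus, so unresolved C-terminal residues add uncertainty to the position of the fusion point.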


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths, according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right owing to the ability of the 4xL linker to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results of the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
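
Panel (B) describes calling a PPI positive when its colony size exceeds a z-score cutoff. A minimal sketch of that kind of thresholding follows. It assumes z-scores are computed against the mean and sample standard deviation of all colony sizes in the same experiment, which may differ from the reference distribution actually used in the thesis. The function name `positive_ppis` and the toy data are hypothetical, and the cutoff is a parameter.

```python
from statistics import mean, stdev
from typing import Dict, Tuple

Pair = Tuple[str, str]


def positive_ppis(colony_sizes: Dict[Pair, float],
                  threshold: float = 2.5) -> Dict[Pair, float]:
    """Return the z-score of every bait-prey pair whose z-score exceeds threshold.

    Assumption: the background is the full set of colony sizes passed in.
    """
    values = list(colony_sizes.values())
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation; needs >= 2 values
    if sigma == 0:
        return {}
    z_scores = {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}
    return {pair: z for pair, z in z_scores.items() if z > threshold}


if __name__ == "__main__":
    # Toy data: twenty background colonies and one large colony.
    sizes = {("bait", f"prey{i}"): (95.0 if i % 2 else 105.0) for i in range(20)}
    sizes[("Scl1", "Rpn5")] = 500.0
    print(positive_ppis(sizes))
```

With a global background like this, the cutoff is only meaningful when most tested pairs do not interact, which is the usual situation in a large screen.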

Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
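
Panels (C) and (D) refer to a distance-weighted shortest path computed over surface residues. The sketch below illustrates the general idea with Dijkstra's algorithm on a graph whose nodes are 3D points and whose edges connect points closer than a cutoff, weighted by Euclidean distance. The cutoff rule, the function name `shortest_surface_path` and the toy coordinates are assumptions for illustration; the thesis may have built its graph differently.

```python
import heapq
import math
from typing import List, Sequence


def shortest_surface_path(points: List[Sequence[float]],
                          start: int, end: int, cutoff: float) -> float:
    """Length of the shortest path from points[start] to points[end].

    Two points are connected only if they lie within `cutoff` of each other,
    and each edge is weighted by the Euclidean distance between its ends.
    Returns math.inf when the two points are not connected.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        dist, i = heapq.heappop(heap)
        if i == end:
            return dist
        if i in visited:
            continue
        visited.add(i)
        for j, point in enumerate(points):
            if j in visited:
                continue
            weight = math.dist(points[i], point)
            if weight <= cutoff and dist + weight < best.get(j, math.inf):
                best[j] = dist + weight
                heapq.heappush(heap, (dist + weight, j))
    return math.inf


if __name__ == "__main__":
    # Toy coordinates: the direct hop 0 -> 2 is longer than the cutoff,
    # so the path has to go through point 1.
    coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
    print(shortest_surface_path(coords, 0, 2, cutoff=4.5))
```

Restricting edges to neighbouring surface points makes the path follow the outside of the complex instead of cutting straight through it, which is closer to the route a flexible linker would have to take.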


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close to each other in space, without these associations necessarily being physical interactions. Such a method would make it possible to explore and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to check whether increasing the linker length changed the interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that adjusting a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and allowed associations to be identified more reliably. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, of all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths that are used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases.

The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by a destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when central proteins are involved. Finally, the detected associations accurately reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow associations between proteins that are further apart in space to be detected.

The modification made to DHFR PCA represents a real advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the linkers of other lengths. This would provide a validation and a point of comparison, and would help detect problems that occurred during construction of the fusion proteins. For example, it is easier to spot a faulty recombination event or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale with a robotic platform. Moreover, DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close to each other, which reduces false positives. DHFR PCA can be performed on solid or in liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the dynamics of interactions (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would give several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. It would also be necessary to determine the optimal increment that maximizes the new information gained, taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas the resolution should be coarser for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709


often requires overexpression of the proteins, which can generate false positives. It is therefore impossible to establish links between the abundance of a protein and the strength or abundance of an interaction between proteins (48-50). Despite these constraints, it is still widely used because it allows the PPIs of another species, such as human, to be studied in a simpler model (51).

In the case of MYTH, the two reporter fragments come from a mutated ubiquitin to which a transcription factor is attached. When the proteins of interest interact physically, the transcription factor attached to the reconstituted ubiquitin is released, which activates transcription of a reporter gene. Methods based on split-ubiquitin have enabled major advances in the study of membrane proteins, insoluble proteins and proteins located outside the nucleus. However, MYTH shares some drawbacks with Y2H, such as substantial background noise and the impossibility of quantifying the results (47-50, 52, 53).

PCA is similar to the two methods described above, but instead of using a transcription factor as a reporter, it uses a protein that has been split into two fragments. The choice of the reporter and of the cleavage site were decisive elements in the design of the method. Moreover, because the reporter fragments come from a single protein rather than from two subunits of the same protein, they do not tend to associate spontaneously unless they are very close to each other, which reduces background noise (54). In yeast, PCA uses as a reporter a mutated version of the enzyme dihydrofolate reductase (DHFR) that makes the cell resistant to methotrexate (MTX). This enzyme is essential for cell growth and is involved, among other things, in the synthesis of certain DNA bases (the purines and thymine). In yeast, the observed signal is cell density, that is, the number of cells that managed to grow on the selection medium. This technique has the advantage of being quantitative while retaining the native promoter of the proteins studied (48, 55, 56). In addition, the results obtained by PCA suggest that the cellular localization of the proteins is preserved: there is Gene Ontology enrichment for several known proteins sharing the same cellular localization (55). However, a change in localization cannot be ruled out, because the reporter fragments are added at the C-terminus, which could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. However, adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two. The composition or length of the linker may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and help ensure that the functions of each element are maintained. They are mostly useful when a flexible linker is insufficient to separate the two elements or interferes with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. It is therefore essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although classified in the second category of methods, FRET, cross-linking followed by MS, and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on energy transfer between two fluorescent proteins that are close to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are close together. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, substantial background noise and the partial overlap of the fluorescence of the two proteins can hinder the interpretation of the results (60-63).

Cross-linking followed by MS is practically identical to purification and MS techniques, except that before purification the proteins are attached to one another by covalent bonds. These bonds resist enzymatic digestion and thus provide structural information on how the proteins associate within the protein complex. Nevertheless, cross-linking complicates data analysis and can lead to an incorrect picture of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase that lacks specificity and is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions in addition to interactions between neighbouring proteins. However, the biotin ligase is larger than the green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can interfere with the activity of the protein of interest or with the formation of interactions. Moreover, this method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly useful because they give a more global view of the PPI network. They report on the proximity of proteins, giving access to a scale of molecular resolution that is otherwise difficult to reach. Besides being complex, the existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply at large scale. Developing simpler, higher-throughput hybrid methods would help define the architecture of protein complexes and of their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. They would also compensate for the shortcomings of high-molecular-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes but are difficult to apply to many protein complexes and require an approach specific to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and of the linker that joins the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment made of two repeats of the following motif: four glycines and one serine (GGGGS). It provides good flexibility and good association of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small, uncharged amino acids, the former nonpolar and the latter polar. The linker joins the reporter fragment to the C-terminus of the proteins under study.

Linker length also places a constraint on the ability to detect an interaction, as was notably observed by the research team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to obtain new structural information.
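The relationship between linker length and detectable distance can be sketched as simple arithmetic. The example below is a rough illustration, not the authors' calculation: it assumes that each residue contributes the same length to a fully extended linker, and it back-calculates that per-residue length from the 82 Å figure quoted above for two 10-residue (GGGGS)2 linkers. The function names and the longer linkers are hypothetical.

```python
# Rough "ruler" for linker reach. The per-residue length is an assumption
# back-calculated from the text: two 10-residue linkers end to end <-> 82 Å.
MOTIF = "GGGGS"
RESIDUES_IN_TWO_2X_LINKERS = 2 * 2 * len(MOTIF)  # 20 residues
ANGSTROM_PER_RESIDUE = 82.0 / RESIDUES_IN_TWO_2X_LINKERS  # about 4.1 Å (assumed uniform)


def linker_sequence(repeats: int) -> str:
    """Peptide sequence of a linker made of `repeats` copies of the motif."""
    return MOTIF * repeats


def max_span(repeats_a: int, repeats_b: int) -> float:
    """Approximate reach (Å) of two fully extended linkers placed end to end."""
    residues = len(linker_sequence(repeats_a)) + len(linker_sequence(repeats_b))
    return residues * ANGSTROM_PER_RESIDUE


if __name__ == "__main__":
    # Hypothetical combinations of 2-repeat and 4-repeat linkers.
    for pair in [(2, 2), (2, 4), (4, 4)]:
        print(pair, round(max_span(*pair), 1), "Å")
```

Under this simplification, doubling both linkers doubles the theoretical reach. Real linkers are flexible and rarely fully extended, so these values are upper bounds at best.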

1.6 Research objectives

The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the length of the DHFR PCA linker would allow the detection of interactions between proteins that are increasingly distant in space, which would modulate the scale of molecular resolution. This adaptation would then yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To reach this objective, 15 proteins found in seven protein complexes were tested for protein-protein associations against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and of their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The last objective was to determine whether increasing linker length allowed the detection of associations between proteins that are further apart in space. To do so, distances were calculated between the proteins contained in the proteasome structures and compared with the experimental results.
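The comparison described in the last objective can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis performed in the thesis: the coordinates are invented, the subunit names are hypothetical, and the distance is taken between single C-terminal atoms, whereas real values would be read from a published structure file. The 82 Å threshold is the figure quoted earlier in the text.

```python
import math

# Hypothetical coordinates (Å) of the C-terminal alpha carbon of three subunits.
# In practice these would be extracted from a published structure of the complex.
C_TERMINI = {
    "subunit_A": (10.0, 0.0, 0.0),
    "subunit_B": (50.0, 30.0, 0.0),
    "subunit_C": (10.0, 0.0, 120.0),
}


def c_term_distance(a: str, b: str) -> float:
    """Euclidean distance (Å) between the C-termini of two subunits."""
    return math.dist(C_TERMINI[a], C_TERMINI[b])


def within_reach(a: str, b: str, reach: float) -> bool:
    """True if the two C-termini are close enough to be bridged by the linkers."""
    return c_term_distance(a, b) <= reach


if __name__ == "__main__":
    reach = 82.0  # distance quoted in the text for two 2-repeat linkers
    for a, b in [("subunit_A", "subunit_B"), ("subunit_A", "subunit_C")]:
        print(a, b, round(c_term_distance(a, b), 1), within_reach(a, b, reach))
```

Pairs flagged as within reach could then be set against the pairs that gave a PCA signal, which is the kind of comparison the objective describes.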

This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive in several respects, notably the availability of numerous powerful genetic tools, its rapid cell division, and the abundance of data on the structure of protein complexes and on PPIs. In addition, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool to detect and measure protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast Protein-fragment Complementation Assay based on the dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are further apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories, based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods, because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86).

The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary: the first tells us which proteins assemble into complexes in the cell, and the second how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future development in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was shown that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter allows the detection of configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% Yeast Extract, 2% Tryptone, 2% Glucose and 2% Agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% Yeast Nitrogen Base without amino acids and without ammonium sulfate, 2% Glucose, 2.5% Noble Agar, Drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).
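As a practical aid, the recipe above can be converted into amounts for a given batch volume. The sketch below is only an illustration: it reads the listed medium concentrations as percent weight per volume (grams per 100 mL), which is an assumption where the source does not state the unit, and the batch volume and helper names are hypothetical.

```python
# Percent (w/v) is read as grams per 100 mL; this unit is an assumption.
YPD_SOLID_PERCENT = {
    "yeast extract": 1.0,
    "tryptone": 2.0,
    "glucose": 2.0,
    "agar": 2.0,
}


def grams_needed(percent_wv: float, volume_ml: float) -> float:
    """Mass (g) of a component at `percent_wv` % (w/v) in `volume_ml` mL."""
    return percent_wv * volume_ml / 100


def antibiotic_mg(conc_ug_per_ml: float, volume_ml: float) -> float:
    """Mass (mg) of an antibiotic dosed at `conc_ug_per_ml` µg/mL."""
    return conc_ug_per_ml * volume_ml / 1000


if __name__ == "__main__":
    volume_ml = 500  # hypothetical batch size
    for name, percent in YPD_SOLID_PERCENT.items():
        print(f"{name}: {grams_needed(percent, volume_ml):.1f} g")
    print(f"clonNAT: {antibiotic_mg(100, volume_ml):.1f} mg")
    print(f"HygB: {antibiotic_mg(250, volume_ml):.1f} mg")
```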


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% Yeast extract, 1.6% Tryptone, 0.2% Glucose, 0.5% NaCl and 2% Agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
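The idea of encoding repeated motifs with synonymous codons can be sketched as follows. The codons chosen here are purely illustrative and are not the sequences used in the plasmids; the helper names are hypothetical. One plausible motivation, not stated in the text, is to limit stretches of identical DNA in the repeated linker.

```python
# Standard-genetic-code codons for glycine and serine.
CODONS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
}
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def encode_repeat(motif: str, variant: int) -> str:
    """Encode `motif`, picking for each residue the codon at index `variant`
    (wrapping around when a residue has fewer codons)."""
    return "".join(CODONS[aa][variant % len(CODONS[aa])] for aa in motif)


def translate(dna: str) -> str:
    """Translate a Gly/Ser-only coding sequence back into one-letter peptide code."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    motif = "GGGGS"
    repeats = [encode_repeat(motif, v) for v in range(4)]  # illustrative 4-repeat linker
    linker_dna = "".join(repeats)
    print(linker_dna)
    print(translate(linker_dna))
```

Each repeat differs at the DNA level while `translate` returns the same GGGGS peptide for all of them.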

In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57, flanked by BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and confirmed by Sanger sequencing.
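For readers setting up a similar assembly, an insert:vector ratio of 5:1 can be translated into DNA masses with the usual proportionality between mass and length. The sketch below assumes the ratio is molar, which the text does not state explicitly, and the fragment sizes and vector amount are hypothetical placeholders rather than values from this work.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   molar_ratio: float) -> float:
    """Mass of insert (ng) giving `molar_ratio` insert molecules per vector
    molecule, for double-stranded DNA where mass scales with length."""
    return vector_ng * molar_ratio * insert_bp / vector_bp


if __name__ == "__main__":
    # Hypothetical example: 50 ng of a 4000 bp linearized vector and a 400 bp insert.
    print(insert_mass_ng(vector_ng=50, vector_bp=4000, insert_bp=400, molar_ratio=5))
```

Because the insert is shorter than the vector, a fivefold molar excess can still correspond to a smaller mass of insert than of vector.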

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.

Estimation of protein abundance

Protein abundance was quantified by Western blot for several strains with proteins fused to the 2xL and 4xL. These proteins were selected because their abundance could easily be assessed using antibodies raised against them. 20 OD600 of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were then probed with IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belonged to the same complexes as the baits or interacted with one of them, based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complex experiment, we reviewed the literature and considered the consensus protein complexes published in (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 of the 95 proteins initially selected were tagged with both 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above.


Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which differences in signal were increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. Correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We applied an image correction where the intensity of each pixel was extracted, and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files, and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze Particles” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, using the particle that is closest to the center. We considered the particle as being a colony if the mass center was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
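The transformation and the diploid-size filter described above can be sketched in a few lines of Python. This is only an illustration: the `(mtx_intensity, diploid_size)` record layout and the function names are hypothetical, and the cutoff of 16 is taken as printed in the text.

```python
import math


def log2_intensity(value):
    """Return log2(value + 1); adding 1 keeps zero intensities finite."""
    return math.log2(value + 1)


def filter_and_transform(colonies, min_diploid_size=16):
    """Drop colonies whose diploid-plate size is below the cutoff, then
    log2-transform the MTX intensity of the remaining colonies.

    `colonies` is a list of (mtx_intensity, diploid_size) pairs; this
    layout is an assumption made for the example only.
    """
    return [
        log2_intensity(mtx)
        for mtx, diploid in colonies
        if diploid >= min_diploid_size
    ]


# Invented values: the third colony is too small on the diploid plate.
example = [(0, 40), (1023, 35), (500, 10), (3, 16)]
transformed = filter_and_transform(example)
print(transformed)
```

A colony with zero intensity maps to 0 rather than to an undefined value, which is the reason for adding 1 before the transformation.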

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained, and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
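To make the scoring step concrete, the following Python sketch fits a two-component normal mixture by expectation-maximization and converts scores into z-scores relative to the lower (background) component. It is a simplified stand-in for the mixtools fit used in the analysis, not a reimplementation of it: the initialisation, the iteration count and the synthetic scores are all assumptions made for illustration.

```python
import math
import random
import statistics


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_normals(scores, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Returns two (weight, mean, sd) tuples sorted by mean, so the first
    tuple is taken to be the background component.
    """
    xs = sorted(scores)
    half = len(xs) // 2
    # Crude initialisation: lower and upper halves of the sorted scores.
    comps = [
        (0.5, statistics.mean(part), max(statistics.pstdev(part), 1e-3))
        for part in (xs[:half], xs[half:])
    ]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, s) for w, m, s in comps]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and sd of each component.
        updated = []
        for k, old in enumerate(comps):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                updated.append(old)
                continue
            mean = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, xs)) / nk
            updated.append((nk / len(xs), mean, max(math.sqrt(var), 1e-3)))
        comps = updated
    return sorted(comps, key=lambda c: c[1])


def z_score(interaction_score, background):
    """Zs = (Is - b) / sdb, with b and sdb from the background component."""
    _, b, sdb = background
    return (interaction_score - b) / sdb


# Synthetic scores (invented): mostly background, a few true interactions.
rng = random.Random(0)
demo = [rng.gauss(5.0, 0.5) for _ in range(200)] + \
       [rng.gauss(10.0, 0.5) for _ in range(30)]
background, signal = fit_two_normals(demo)
detected = [s for s in demo if z_score(s, background) > 2.5]
```

Taking the component with the lower mean as background mirrors the logic of the text, where most tested pairs are expected not to interact; the 2.5 cutoff is the one stated above.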

For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values lying beyond Q1 - 3(Q3 - Q1) or Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained, and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
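The outlier rule and the replicate requirement above could be expressed as follows. This is a minimal sketch: the function names are hypothetical, and the quartile convention of Python's `statistics.quantiles` is an assumption that may differ slightly from the one used in the original analysis.

```python
import statistics


def remove_extreme_outliers(values, k=3.0):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR], where IQR = Q3 - Q1."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def interaction_score(replicates, min_replicates=4):
    """Median colony size of the replicates, or None if too few remain."""
    if len(replicates) < min_replicates:
        return None
    return statistics.median(replicates)


# Invented colony sizes: 100 is an extreme outlier and is dropped.
plate = [10, 11, 12, 13, 14, 15, 16, 100]
kept = remove_extreme_outliers(plate)
score = interaction_score(kept)
```

Using a factor of 3 rather than the more common 1.5 removes only the most extreme colonies, which matches the wording "extreme outliers" in the text.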

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination: 2xL-4xL, 4xL-2xL and 4xL-4xL. A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05); and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1 with t-test FDR p-value < 0.05) (Table S1C).
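As an illustration of the classification rule, the sketch below adjusts a list of p-values and labels each comparison. Two assumptions should be noted: the text does not name the FDR procedure, so the Benjamini-Hochberg method is used here as a common choice, and the raw p-values are assumed to come from paired t-tests computed elsewhere. How a pair with a significant p-value but a difference within ±1 was handled is not stated; it is labelled "unchanged" here for simplicity.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(difference, q_value, min_diff=1.0, alpha=0.05):
    """Label a comparison against the 2xL-2xL reference interaction."""
    if q_value < alpha and abs(difference) > min_diff:
        return "quantitative change"
    return "unchanged"


# Invented differences (longer linker minus reference) and raw p-values.
differences = [1.8, 0.2, -1.4, 0.5]
raw_p = [0.01, 0.04, 0.03, 0.6]
q_values = benjamini_hochberg(raw_p)
labels = [classify(d, q) for d, q in zip(differences, q_values)]
```

Requiring both a minimum effect size and a corrected p-value keeps small but statistically significant shifts from being reported as quantitative changes.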

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle, the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two PDB 5A5B structures on the structure of 5CZ4, one on each side of the core particle (CP), using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal distances possible was used for the analyses.
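The idea of a surface-constrained distance can be illustrated with a small Python sketch. The analysis itself relied on NetworkX; the version below uses only the standard library (`heapq`) so that it runs without dependencies, and the node names and coordinates are invented for the example.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff` (in Å).

    `coords` maps a node name to its (x, y, z) position; each edge is
    weighted by the Euclidean distance between its two nodes.
    """
    nodes = list(coords)
    graph = {n: [] for n in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if no path exists."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf


# Hypothetical surface Cα positions; A and C are too far apart for a direct edge.
surface = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (20, 0, 0), "D": (10, 10, 0)}
graph = build_graph(surface, cutoff=15.0)
length = shortest_path_length(graph, "A", "C")
```

Because edges exist only between nearby surface residues, the path follows the outside of the complex instead of cutting through its interior, so the resulting length is at least as long as the straight-line distance between the two termini.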

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI).

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in the interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and interaction z-score of PPIs for all lengths of linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linker sizes (Fig. S1D). Finally, we find that distance among proteins is significantly longer for cases where longer linker size increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that longer linker size enhances the ability to detect interactions, especially for proteins that are more distant in space.
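The rank correlation between pairwise distance and z-score can be reproduced in principle with a short Python sketch. The implementation below is a plain Spearman coefficient (Pearson correlation of average ranks) written with the standard library only; the distances and z-scores are invented and do not come from Table S2A.

```python
import math


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Invented example: z-scores fall as the C-terminus distance (Å) grows.
distances = [20.0, 45.0, 80.0, 120.0, 160.0]
z_scores = [9.1, 7.4, 7.9, 3.2, 1.5]
rho = spearman(distances, z_scores)
```

A rank-based coefficient is a natural fit here because the relationship between distance and PCA signal need not be linear, only monotonic.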

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering each pair of reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A. Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference  Yeast protein  Complex  Identity (%)
4C2M chain 1  Rpc10  RNApol I  100
4C2M chain 2  Rpa34  RNApol I  92.4
4C2M chain 3  Rpa49  RNApol I  94.4
4C2M chain 4  Rpa43  RNApol I  100
4C2M chain 5  Rpa190  RNApol I  89.7
4C2M chain 6  Rpc40  RNApol I  100
4C2M chain 7  Rpa135  RNApol I  97.2
4C2M chain 8  Rpb5  RNApol I  100
4C2M chain 9  Rpa14  RNApol I  59.6
4C2M chain 10  Rpa43  RNApol I  81.4
4C2M chain 11  Rpo26  RNApol I  100
4C2M chain 12  Rpa12  RNApol I  100
4C2M chain 13  Rpb8  RNApol I  88.2
4C2M chain 14  Rpc19  RNApol I  100
4C2M chain 15  Rpb10  RNApol I  100
4C2M chain 16  Rpa49  RNApol I  100
4C2M chain 17  Rpc10  RNApol I  100
4C2M chain 18  Rpa43  RNApol I  100
4C2M chain 19  Rpa34  RNApol I  92.4
4C2M chain 20  Rpa135  RNApol I  96.2
4C2M chain 21  Rpa190  RNApol I  88.5
4C2M chain 22  Rpa14  RNApol I  55.1
4C2M chain 23  Rpc40  RNApol I  100
4C2M chain 24  Rpo26  RNApol I  100
4C2M chain 25  Rpb5  RNApol I  100
4C2M chain 26  Rpb8  RNApol I  88.2
4C2M chain 27  Rpa43  RNApol I  80.2
4C2M chain 28  Rpb10  RNApol I  100
4C2M chain 29  Rpa12  RNApol I  96
4C2M chain 30  Rpc19  RNApol I  100
4C3I chain A  Rpa190  RNApol I  89.2
4C3I chain C  Rpc40  RNApol I  99.3
4C3I chain B  Rpa135  RNApol I  98.2
4C3I chain E  Rpb5  RNApol I  100
4C3I chain D  Rpa14  RNApol I  55.1
4C3I chain G  Rpa43  RNApol I  78.3
4C3I chain F  Rpo26  RNApol I  100
4C3I chain I  Rpa12  RNApol I  100
4C3I chain H  Rpb8  RNApol I  84.7
4C3I chain K  Rpc19  RNApol I  100
4C3I chain J  Rpb10  RNApol I  100
4C3I chain M  Rpa49  RNApol I  97.2
4C3I chain L  Rpc10  RNApol I  100
4C3I chain N  Rpa34  RNApol I  88
4V1N chain A  Rpo21  RNApol II  97.9
4V1N chain C  Rpb3  RNApol II  100
4V1N chain B  Rpb2  RNApol II  93.6
4V1N chain E  Rpb5  RNApol II  100
4V1N chain D  Rpb4  RNApol II  80.8
4V1N chain G  Rpb7  RNApol II  100
4V1N chain F  Rpo26  RNApol II  100
4V1N chain I  Rpb9  RNApol II  100
4V1N chain H  Rpb8  RNApol II  91
4V1N chain K  Rpb11  RNApol II  100
4V1N chain J  Rpb10  RNApol II  100
4V1N chain L  Rpc10  RNApol II  100
4V1N chain R  Tfg2  RNApol II  60.3
5FJA chain A  Rpo31  RNApol III  96.2
5FJA chain C  Rpc40  RNApol III  100
5FJA chain B  Ret1  RNApol III  100
5FJA chain E  Rpb5  RNApol III  100
5FJA chain D  Rpc17  RNApol III  73.9
5FJA chain G  Rpc25  RNApol III  85.8
5FJA chain F  Rpo26  RNApol III  100
5FJA chain I  Rpc11  RNApol III  82.7
5FJA chain H  Rpb8  RNApol III  94.5
5FJA chain K  Rpc19  RNApol III  100
5FJA chain J  Rpb10  RNApol III  100
5FJA chain M  Rpc37  RNApol III  84.9
5FJA chain L  Rpc10  RNApol III  100
5FJA chain O  Rpc82  RNApol III  84.3
5FJA chain N  Rpc53  RNApol III  73.8
5FJA chain Q  Rpc31  RNApol III  100
5FJA chain P  Rpc34  RNApol III  57.2


Table S2C. Identity between the proteasome structure and the experimental sequences

Reference  Yeast protein  Complex  Identity (%)

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 97.1

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 97.1

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 82.9

5A5B-centered chain E Pre2 Proteasome 99.5

5A5B-centered chain EA Rpn11 Proteasome 89.5

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 85.9


5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 93.1

5A5B-centered chain W Rpn2 Proteasome 90.9

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 93.1

Constructed proteasome chain 22 Rpn2 Proteasome 90.9

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 82.9

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 89.5

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 85.9

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100


Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100


Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Yeast protein, Complex, Reference, Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right because the 4xL linker can detect interactions between proteins that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assays of selected results from the intra-complex experiments. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
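The thresholding in panel (B) and the reciprocal-pair correlation in panel (C) can be illustrated with a minimal Python sketch. It is not the analysis code used for this work. It assumes that z-scores are computed against the mean and sample standard deviation of all colony sizes on a plate and that a PPI is called positive above a z-score of 2.5; the function names and the example colony sizes are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def z_scores(sizes):
    """Return {pair: z-score} for a dict of colony sizes keyed by protein pair.

    Assumption: the background is the whole plate (mean and sample SD).
    """
    values = list(sizes.values())
    mu = mean(values)
    sigma = stdev(values)
    return {pair: (size - mu) / sigma for pair, size in sizes.items()}


def positive_ppis(sizes, threshold=2.5):
    """Return the sorted list of pairs whose z-score exceeds the threshold."""
    return sorted(pair for pair, z in z_scores(sizes).items() if z > threshold)


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of PPI signals,
    e.g. bait-prey versus prey-bait colony sizes for reciprocal pairs."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical plate: nine background colonies and one large colony.
example_sizes = {f"background_{i}": 100 for i in range(9)}
example_sizes["Scl1-Rpn5"] = 1000
example_hits = positive_ppis(example_sizes)
```

With so few colonies, a single strong signal inflates the standard deviation, so a real screen would estimate the background from many more pairs than this toy plate.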


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres/dots: surface. Green spheres: surface residues on the proteasome.
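The distance-weighted shortest path in panels (C) and (D) can be sketched as a graph search over surface residues. The following Python example is only an illustration under stated assumptions, not the procedure used to produce the figure: residues are reduced to single 3D points, two residues are connected when they lie within a hypothetical `cutoff` distance, edges are weighted by Euclidean distance, and Dijkstra's algorithm is the chosen search. All names and coordinates are invented.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, cutoff):
    """Length of the distance-weighted shortest path between two residues.

    coords: {residue_name: (x, y, z)} for the two C-terminal residues and
            the surface residues (hypothetical input format).
    cutoff: maximum distance at which two residues are considered linked
            (an assumption of this sketch).
    Returns math.inf when no path exists.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        dist, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == end:
            return dist
        for other, xyz in coords.items():
            if other in visited:
                continue
            step = math.dist(coords[node], xyz)
            if step > cutoff:
                continue  # too far apart to be linked directly
            candidate = dist + step
            if candidate < best.get(other, math.inf):
                best[other] = candidate
                heapq.heappush(heap, (candidate, other))
    return math.inf


# Hypothetical coordinates: the two C-termini are too far apart to be linked
# directly, so the path has to go through a surface residue.
example_coords = {
    "Scl1_Cter": (0.0, 0.0, 0.0),
    "Rpn5_Cter": (4.0, 0.0, 0.0),
    "surface_1": (2.0, 2.0, 0.0),
}
example_length = shortest_surface_path(
    example_coords, "Scl1_Cter", "Rpn5_Cter", cutoff=3.0
)
```

Constraining the path to surface residues gives a distance that goes around the complex rather than through it, which is a closer match to the route a flexible linker would have to take than the straight-line distance.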


General conclusion

The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid method is one that detects associations between proteins that are close to each other in space, without these associations necessarily being physical interactions. Such a method makes it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, the work consisted of modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the set of interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thereby improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, more than 30% of all unique interactions detected were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases.


The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, in particular by identifying the strongest associations within the complex. The use of extended linkers also informs on the organization of protein complexes, particularly when central proteins are involved. Finally, the associations detected reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to the DHFR PCA represents a real step forward in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths. This would provide a validation and a point of comparison, and would help detect problems that arose during the construction of the fusion proteins. For example, it becomes easier to spot a faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale with a robotic platform. In addition, the DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously unless they are very close to each other, which reduces false positives. The DHFR PCA can be performed on solid or in liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features provide certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that gives access to several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also need to be determined, so as to maximize the new information gained while taking into account the complexity added by each additional linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and thus shorter linkers, may be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would provide a general picture of their architecture. At the scale of an organism's proteome, this method would give a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

50

84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709


given that the reporter fragments are added on the C-terminal side, which could interfere with the localization signal sequence of the proteins (57).

A major drawback of most of these techniques stems from the addition of reporter fragments, which can affect the folding, cellular function or abundance of the protein. However, adding a linker often reduces these risks by moving the reporter fragment away from the protein to which it is attached, which reduces interference between the two proteins. Its composition or length may need to be optimized. There are three categories of linkers: flexible linkers, rigid linkers and in vivo cleavable linkers. Flexible linkers are generally used when some mobility between the protein of interest and the reporter fragment is desirable. Rigid linkers provide better separation between the protein of interest and the reporter fragment and ensure that the functions of each element are maintained. They are mostly useful when a flexible linker is insufficient to properly separate the two elements or when it interferes with the activity of the protein. In vivo cleavable linkers allow the reporter fragment to be released under certain conditions. They are particularly useful for allowing each element to carry out its own biological activity. Consequently, it is essential to choose the linker and its parameters carefully to obtain the expected results (58, 59).

1.3.2.2 Hybrid methods

Although classified in the second category of methods, FRET, cross-linking followed by MS and BioID are hybrid methods that measure protein-protein associations at lower resolution.

FRET relies on energy transfer between two fluorescent proteins in close proximity to each other. The two fluorescent proteins are fused to the two proteins whose proximity is to be tested. Excitation of the donor fluorescent protein leads to excitation of the acceptor fluorescent protein when the two proteins are close to each other. The interaction is detected by microscopy or cytometry through the emission of the acceptor fluorescent protein. This method is particularly useful for following an interaction over time. However, high background noise and the partial overlap of the fluorescence of the two proteins can hinder the interpretation of the results (60-63).

Cross-linking followed by MS is practically identical to the purification and MS techniques, except that the proteins are covalently linked to each other before purification. These links withstand enzymatic digestion, thereby providing structural information on how the proteins associate within the protein complex. Nevertheless, cross-linking complicates data analysis and can lead to an incorrect picture of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase lacking specificity that is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions as well as interactions between neighbouring proteins. However, the biotin ligase is larger than the green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can impair the activity of the protein of interest or the formation of interactions. In addition, this method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly valuable because they give a more global view of the PPI network. They report on protein proximity, giving access to a new scale of molecular resolution that is otherwise difficult to reach. In addition to being complex, existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would make it possible to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. These new hybrid methods would compensate for the shortcomings of high-molecular-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to many protein complexes and require an approach specific to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good association of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small amino acids, the former nonpolar and the latter polar and uncharged. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also imposes a constraint on the ability to detect an interaction, as notably observed by the research team that developed large-scale PCA (55). Studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is thus possible to detect new interactions and, in doing so, to obtain new structural information.
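As a rough illustration of how linker length relates to the detectable distance, the sketch below estimates the maximal span covered by two linkers placed end to end. It assumes, as a simplification that is not stated in the cited study, that every residue contributes the same length, back-calculated from the 82 Å reported for two 10-residue linkers (4.1 Å per residue). The function names are hypothetical.

```python
# Rough estimate of the span covered by two (GGGGS)n linkers placed end to end.
# ASSUMPTION: a uniform per-residue length, back-calculated from the 82 Å
# figure for two 10-residue linkers; this is not a measured value.
RESIDUES_PER_REPEAT = 5               # one GGGGS motif
ANGSTROM_PER_RESIDUE = 82.0 / 20      # 4.1 Å per residue (assumed)


def linker_length(n_repeats: int) -> float:
    """Approximate length (Å) of a linker made of n_repeats GGGGS motifs."""
    return n_repeats * RESIDUES_PER_REPEAT * ANGSTROM_PER_RESIDUE


def max_reach(bait_repeats: int, prey_repeats: int) -> float:
    """Approximate maximal C-terminus to C-terminus distance (Å) bridged."""
    return linker_length(bait_repeats) + linker_length(prey_repeats)


if __name__ == "__main__":
    for bait, prey in [(2, 2), (3, 2), (4, 2), (4, 4)]:
        print(f"{bait}xL-{prey}xL: about {max_reach(bait, prey):.1f} Å")
```

Under this assumption, doubling both linkers doubles the estimated reach. Real linkers are flexible and rarely fully extended, so these values are better read as upper bounds than as expected distances.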

1.6 Research objectives

The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length of the DHFR PCA would make it possible to detect interactions that are increasingly distant in space, thereby modulating the scale of molecular resolution. This adaptation would then yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To achieve this objective, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The last objective was to determine whether increasing linker length made it possible to detect associations between proteins that are farther apart in space. To do so, distances were calculated between the proteins contained in the proteasome structures and compared with the experimental results.

This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive for several reasons, notably the availability of numerous powerful genetic tools, its rapid cell division and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).

Measuring proximate protein association in living cells using Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system functions requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.

Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here we examine the potential of the yeast Protein-fragment Complementation Assay based on the dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.

Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary because the first tells us which proteins assemble into complexes in the cell and the second how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future development in cell and systems biology because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint imposed by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% Yeast Extract, 2% Tryptone, 2% Glucose and 2% Agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% Yeast Nitrogen Base without amino acids and without ammonium sulfate, 2% Glucose, 2.5% Noble Agar, Drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).

Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% Yeast extract, 1.6% Tryptone, 0.2% Glucose, 0.5% NaCl and 2% Agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
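The idea of adding repeats built from synonymous codons can be sketched as follows. The codon choices below are hypothetical and are not the sequences used in the plasmids; the sketch only checks that two different DNA repeats translate to the same Gly-Gly-Gly-Gly-Ser motif. One plausible motivation, not stated in the text, is to limit identical DNA repeats within the linker.

```python
# Minimal codon table restricted to glycine and serine (standard genetic code).
GLY_SER_CODONS = {
    "GGT": "G", "GGC": "G", "GGA": "G", "GGG": "G",
    "TCT": "S", "TCC": "S", "TCA": "S", "TCG": "S", "AGT": "S", "AGC": "S",
}

# HYPOTHETICAL codon choices for two synonymous GGGGS repeats.
REPEAT_A = "GGTGGCGGAGGGTCT"
REPEAT_B = "GGCGGAGGTGGCAGC"


def translate(dna: str) -> str:
    """Translate a Gly/Ser-only coding sequence, codon by codon."""
    return "".join(GLY_SER_CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3))


def extended_linker(n_repeats: int) -> str:
    """Alternate the two synonymous repeats to build an n-repeat linker."""
    repeats = (REPEAT_A, REPEAT_B)
    return "".join(repeats[i % 2] for i in range(n_repeats))


if __name__ == "__main__":
    for n in (2, 3, 4):
        dna = extended_linker(n)
        print(f"{n}xL  {dna}  ->  {translate(dna)}")
```

Both repeats encode GGGGS although their nucleotide sequences differ, so the peptide is unchanged while adjacent DNA repeats are no longer identical.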

To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 with flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified and confirmed by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmid with oligonucleotides specific to the gene to fuse with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). Proper DHFR fragment fusions were confirmed by PCR and Sanger sequencing for all strains.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains with proteins fused to the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies raised against them. 20 OD600 of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p, anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belonged to the same complexes as the baits or interacted with one of them according to data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we performed a review of the literature and considered the consensus protein complexes published by Benschop et al. (84) to choose 95 central and associated proteins that are members of the following complexes: the RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested), proteasome (n=47, 44 tested) and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above.

Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
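The block layout described above can be sketched as a simple enumeration. The protein names and the helper function are hypothetical placeholders; only the four linker combinations per bait-prey pair come from the text.

```python
from itertools import product

# Linker sizes tested in the intra-complexes experiment.
LINKERS = ("2xL", "4xL")


def linker_block(bait: str, prey: str) -> list:
    """Return the four bait-prey linker combinations forming one block."""
    return [
        (f"{bait}-{bait_linker}", f"{prey}-{prey_linker}")
        for bait_linker, prey_linker in product(LINKERS, repeat=2)
    ]


if __name__ == "__main__":
    # "ProteinA" and "ProteinB" are placeholder names, not strains from the study.
    for bait_fusion, prey_fusion in linker_block("ProteinA", "ProteinB"):
        print(bait_fusion, "x", prey_fusion)
```

Grouping the four combinations in one block keeps them physically adjacent on the array, so position effects are likely to affect all linker combinations of a given pair similarly.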

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs whose differences in signal were increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and applied a manual threshold across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area inferior to 20 pixels and a circularity inferior to 0.5, using the particle closest to the center. We considered a particle to be a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
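The particle filtering step above can be sketched in a few lines of Python. This is only an illustration, not the custom ImageJ script used in the study: the particle dictionary keys, the function names and the default thresholds (area of 20 pixels, circularity of 0.5) are assumptions, and the sketch assumes that failing either the area or the circularity criterion is enough to exclude a particle. Circularity is computed as in ImageJ, 4π × area / perimeter².

```python
import math


def circularity(area, perimeter):
    """ImageJ-style circularity: 4*pi*area / perimeter**2 (1.0 for a perfect circle)."""
    if perimeter <= 0:
        return 0.0
    return 4.0 * math.pi * area / perimeter ** 2


def pick_colony(particles, cell_center, min_area=20, min_circularity=0.5):
    """Return the retained particle closest to the expected colony center, or None.

    `particles` is a list of dicts with hypothetical keys:
    'area' (pixels), 'perimeter' (pixels), 'centroid' (x, y), 'touches_edge' (bool).
    """
    kept = [
        p for p in particles
        if not p["touches_edge"]                      # drop particles on the selection edge
        and p["area"] >= min_area                     # drop particles that are too small
        and circularity(p["area"], p["perimeter"]) >= min_circularity  # drop irregular shapes
    ]
    if not kept:
        return None
    # Among the remaining particles, keep the one nearest to the center.
    return min(kept, key=lambda p: math.dist(p["centroid"], cell_center))
```

A particle that passes all three filters and lies nearest to the center of its grid position is treated as the colony; positions with no surviving particle return `None`.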

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
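As a minimal sketch of this transformation, assuming each colony is represented by a hypothetical `(mtx_intensity, diploid_size)` pair, the log2(x + 1) transform and the diploid-size filter could look like this. The function names and the data layout are illustrative and do not come from the original analysis scripts.

```python
import math


def log2_intensity(raw_intensity):
    """log2 transform with a pseudocount of 1, so that a raw value of 0 maps to 0."""
    return math.log2(raw_intensity + 1)


def transform_colonies(colonies, min_diploid_size=16):
    """Keep colonies that grew on the diploid selection plate and log2-transform them.

    `colonies` is a list of (mtx_intensity, diploid_size) tuples (hypothetical layout).
    Colonies with a diploid size below `min_diploid_size` are discarded.
    """
    return [
        log2_intensity(intensity)
        for intensity, diploid_size in colonies
        if diploid_size >= min_diploid_size
    ]
```

The pseudocount matters because log2(0) is undefined; with it, a colony of intensity 0 receives a score of 0 rather than being lost.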

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction across linker size combinations. We considered a change significant when Zs differed by more than 2.
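The scoring rule can be illustrated with a short Python sketch. It assumes that the background mean (b) and standard deviation (sdb) have already been estimated by the mixture model, which is not reproduced here; the function names are hypothetical, and the default thresholds (2.5 for detection, 2 for a change between linkers) are passed as arguments so that they can be adjusted.

```python
def z_score(interaction_score, background_mean, background_sd):
    """Zs = (Is - b) / sdb, the deviation of a score from the background distribution."""
    return (interaction_score - background_mean) / background_sd


def is_detected(zs, threshold=2.5):
    """An interaction is called detected when its z-score exceeds the threshold."""
    return zs > threshold


def changed_between_linkers(zs_reference, zs_longer, min_delta=2.0):
    """Flag a change when z-scores of the same interaction differ by more than min_delta."""
    return abs(zs_longer - zs_reference) > min_delta
```

For example, with b = 4 and sdb = 2, an interaction score of 9 gives Zs = 2.5, which sits exactly on the threshold and is therefore not called detected under a strict "greater than" rule.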

For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
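The extreme-outlier rule used at the start of this filtering (values beyond three interquartile ranges from the quartiles) can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, and quartiles are computed with the "inclusive" method of Python's `statistics.quantiles`, which may differ slightly from the quartile definition used in the original analysis.

```python
import statistics


def extreme_outlier_bounds(values, k=3.0):
    """Return (lower, upper) fences: Q1 - k*(Q3 - Q1) and Q3 + k*(Q3 - Q1)."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_extreme_outliers(values, k=3.0):
    """Keep only the values lying within the two fences."""
    low, high = extreme_outlier_bounds(values, k)
    return [v for v in values if low <= v <= high]
```

With k = 3 the fences are wide, so only values far from the bulk of the distribution are removed; the more common k = 1.5 would flag milder outliers as well.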

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, within a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05); and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than −1 with t-test FDR p-value < 0.05) (Table S1C).
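The classification logic can be summarized in a short Python sketch. Several points are assumptions made for illustration: the text does not name the FDR procedure, so the Benjamini-Hochberg adjustment is used here; the paired t-test itself is not reproduced and its p-values are taken as input; pairs with a significant p-value but an absolute difference of 1 or less are treated as unchanged; and all function names and default thresholds (difference of 1, FDR of 0.05) are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the same order as the input."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices by ascending p-value
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def classify_pair(detected_ref, detected_longer, diffs, fdr_p_values,
                  min_diff=1.0, alpha=0.05):
    """Classify a protein pair as 'new', 'quantitative change' or 'unchanged'.

    detected_ref: detected with the 2xL-2xL reference linkers.
    detected_longer: detected with at least one longer linker combination.
    diffs / fdr_p_values: per longer-linker combination, the difference in
    interaction score versus the reference and its FDR-corrected p-value.
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    for diff, p in zip(diffs, fdr_p_values):
        if abs(diff) > min_diff and p < alpha:
            return "quantitative change"
    return "unchanged"
```

In practice the adjusted p-values would be computed once over all tested comparisons and then passed, pair by pair, to the classifier.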

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were taken from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is incomplete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated as the weighted shortest path using Dijkstra's algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of a protein were present within a complex, the mean of the minimal possible distances was used for the analyses.
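The surface-path distance can be sketched in Python as follows. The original analysis relied on NetworkX; this illustration instead implements Dijkstra's algorithm directly with the standard-library `heapq` module so that it runs without dependencies. The function names, the coordinate dictionary and the node labels are hypothetical, and the cutoff is treated as inclusive by assumption.

```python
import heapq
import math


def build_surface_graph(coords, cutoff):
    """Connect every pair of surface C-alpha nodes closer than `cutoff` (in angstroms).

    `coords` maps a node label to its (x, y, z) coordinates. Each edge is
    weighted by the Euclidean distance between its two nodes.
    """
    graph = {node: [] for node in coords}
    nodes = list(coords)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm: length of the weighted shortest path, or inf if unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry, a shorter route was already found
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf
```

Because edges only link nearby surface residues, the resulting path follows the surface of the complex instead of cutting straight through it, which is why this distance can exceed the Euclidean distance between the two C-termini.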

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, the effect is minor and not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10192 unique tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI) after filtration.

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). The PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases the signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that longer linkers enhance the ability to detect interactions, especially for proteins that are more distant in space.

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation (PCA) screen and prove useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A. Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request.


Table S2A. Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request.


Table S2B. Identity between each RNApol structure and the experimental sequences

Reference    Yeast protein    Complex    Identity (%)
4C2M chain 1    Rpc10    RNApol I    100
4C2M chain 2    Rpa34    RNApol I    92.4
4C2M chain 3    Rpa49    RNApol I    94.4
4C2M chain 4    Rpa43    RNApol I    100
4C2M chain 5    Rpa190    RNApol I    89.7
4C2M chain 6    Rpc40    RNApol I    100
4C2M chain 7    Rpa135    RNApol I    97.2
4C2M chain 8    Rpb5    RNApol I    100
4C2M chain 9    Rpa14    RNApol I    59.6
4C2M chain 10    Rpa43    RNApol I    81.4
4C2M chain 11    Rpo26    RNApol I    100
4C2M chain 12    Rpa12    RNApol I    100
4C2M chain 13    Rpb8    RNApol I    88.2
4C2M chain 14    Rpc19    RNApol I    100
4C2M chain 15    Rpb10    RNApol I    100
4C2M chain 16    Rpa49    RNApol I    100
4C2M chain 17    Rpc10    RNApol I    100
4C2M chain 18    Rpa43    RNApol I    100
4C2M chain 19    Rpa34    RNApol I    92.4
4C2M chain 20    Rpa135    RNApol I    96.2
4C2M chain 21    Rpa190    RNApol I    88.5
4C2M chain 22    Rpa14    RNApol I    55.1
4C2M chain 23    Rpc40    RNApol I    100
4C2M chain 24    Rpo26    RNApol I    100
4C2M chain 25    Rpb5    RNApol I    100
4C2M chain 26    Rpb8    RNApol I    88.2
4C2M chain 27    Rpa43    RNApol I    80.2
4C2M chain 28    Rpb10    RNApol I    100
4C2M chain 29    Rpa12    RNApol I    96
4C2M chain 30    Rpc19    RNApol I    100
4C3I chain A    Rpa190    RNApol I    89.2
4C3I chain C    Rpc40    RNApol I    99.3
4C3I chain B    Rpa135    RNApol I    98.2
4C3I chain E    Rpb5    RNApol I    100
4C3I chain D    Rpa14    RNApol I    55.1
4C3I chain G    Rpa43    RNApol I    78.3
4C3I chain F    Rpo26    RNApol I    100
4C3I chain I    Rpa12    RNApol I    100
4C3I chain H    Rpb8    RNApol I    84.7
4C3I chain K    Rpc19    RNApol I    100
4C3I chain J    Rpb10    RNApol I    100
4C3I chain M    Rpa49    RNApol I    97.2
4C3I chain L    Rpc10    RNApol I    100
4C3I chain N    Rpa34    RNApol I    88
4V1N chain A    Rpo21    RNApol II    97.9
4V1N chain C    Rpb3    RNApol II    100
4V1N chain B    Rpb2    RNApol II    93.6
4V1N chain E    Rpb5    RNApol II    100
4V1N chain D    Rpb4    RNApol II    80.8
4V1N chain G    Rpb7    RNApol II    100
4V1N chain F    Rpo26    RNApol II    100
4V1N chain I    Rpb9    RNApol II    100
4V1N chain H    Rpb8    RNApol II    91
4V1N chain K    Rpb11    RNApol II    100
4V1N chain J    Rpb10    RNApol II    100
4V1N chain L    Rpc10    RNApol II    100
4V1N chain R    Tfg2    RNApol II    60.3
5FJA chain A    Rpo31    RNApol III    96.2
5FJA chain C    Rpc40    RNApol III    100
5FJA chain B    Ret1    RNApol III    100
5FJA chain E    Rpb5    RNApol III    100
5FJA chain D    Rpc17    RNApol III    73.9
5FJA chain G    Rpc25    RNApol III    85.8
5FJA chain F    Rpo26    RNApol III    100
5FJA chain I    Rpc11    RNApol III    82.7
5FJA chain H    Rpb8    RNApol III    94.5
5FJA chain K    Rpc19    RNApol III    100
5FJA chain J    Rpb10    RNApol III    100
5FJA chain M    Rpc37    RNApol III    84.9
5FJA chain L    Rpc10    RNApol III    100
5FJA chain O    Rpc82    RNApol III    84.3
5FJA chain N    Rpc53    RNApol III    73.8
5FJA chain Q    Rpc31    RNApol III    100
5FJA chain P    Rpc34    RNApol III    57.2


Table S2C. Identity between the proteasome structure and the experimental sequences

Reference    Yeast protein    Complex    Identity (%)
5CZ4-centered chain A    Pre8    Proteasome    100
5CZ4-centered chain AA    Pre4    Proteasome    100
5CZ4-centered chain B    Pre9    Proteasome    100
5CZ4-centered chain BA    Pre3    Proteasome    100
5CZ4-centered chain C    Pre6    Proteasome    100
5CZ4-centered chain D    Pup2    Proteasome    97.1
5CZ4-centered chain E    Pre5    Proteasome    100
5CZ4-centered chain F    Pre10    Proteasome    100
5CZ4-centered chain G    Scl1    Proteasome    100
5CZ4-centered chain H    Pup1    Proteasome    100
5CZ4-centered chain I    Pup3    Proteasome    100
5CZ4-centered chain J    Pre1    Proteasome    100
5CZ4-centered chain K    Pre2    Proteasome    100
5CZ4-centered chain L    Pre7    Proteasome    100
5CZ4-centered chain M    Pre4    Proteasome    100
5CZ4-centered chain N    Pre3    Proteasome    100
5CZ4-centered chain O    Pre8    Proteasome    100
5CZ4-centered chain P    Pre9    Proteasome    100
5CZ4-centered chain Q    Pre6    Proteasome    100
5CZ4-centered chain R    Pup2    Proteasome    97.1
5CZ4-centered chain S    Pre5    Proteasome    100
5CZ4-centered chain T    Pre10    Proteasome    100
5CZ4-centered chain U    Scl1    Proteasome    100
5CZ4-centered chain V    Pup1    Proteasome    100
5CZ4-centered chain W    Pup3    Proteasome    100
5CZ4-centered chain X    Pre1    Proteasome    100
5CZ4-centered chain Y    Pre2    Proteasome    100
5CZ4-centered chain Z    Pre7    Proteasome    100
5A5B-centered chain A    Pre3    Proteasome    100
5A5B-centered chain AA    Rpn7    Proteasome    100
5A5B-centered chain B    Pup1    Proteasome    100
5A5B-centered chain BA    Rpn3    Proteasome    100
5A5B-centered chain C    Pup3    Proteasome    100
5A5B-centered chain CA    Rpn12    Proteasome    100
5A5B-centered chain D    Pre1    Proteasome    100
5A5B-centered chain DA    Rpn8    Proteasome    82.9
5A5B-centered chain E    Pre2    Proteasome    99.5
5A5B-centered chain EA    Rpn11    Proteasome    89.5
5A5B-centered chain F    Pre7    Proteasome    100
5A5B-centered chain FA    Rpn10    Proteasome    100
5A5B-centered chain G    Pre4    Proteasome    100
5A5B-centered chain GA    Rpn13    Proteasome    100
5A5B-centered chain HA    Sem1    Proteasome    100
5A5B-centered chain IA    Rpn1    Proteasome    85.9
5A5B-centered chain J    Scl1    Proteasome    100
5A5B-centered chain K    Pre8    Proteasome    100
5A5B-centered chain L    Pre9    Proteasome    100
5A5B-centered chain M    Pre6    Proteasome    100
5A5B-centered chain N    Pup2    Proteasome    100
5A5B-centered chain O    Pre5    Proteasome    100
5A5B-centered chain P    Pre10    Proteasome    100
5A5B-centered chain Q    Rpt1    Proteasome    88
5A5B-centered chain R    Rpt2    Proteasome    100
5A5B-centered chain S    Rpt6    Proteasome    100
5A5B-centered chain T    Rpt3    Proteasome    100
5A5B-centered chain U    Rpt4    Proteasome    100
5A5B-centered chain V    Rpt5    Proteasome    93.1
5A5B-centered chain W    Rpn2    Proteasome    90.9
5A5B-centered chain X    Rpn9    Proteasome    100
5A5B-centered chain Y    Rpn5    Proteasome    100
5A5B-centered chain Z    Rpn6    Proteasome    100
Constructed proteasome chain 1    Pup1    Proteasome    100
Constructed proteasome chain 10    Pre8    Proteasome    100
Constructed proteasome chain 11    Pre9    Proteasome    100
Constructed proteasome chain 12    Pre6    Proteasome    100
Constructed proteasome chain 13    Pup2    Proteasome    100
Constructed proteasome chain 14    Pre5    Proteasome    100
Constructed proteasome chain 15    Pre10    Proteasome    100
Constructed proteasome chain 16    Rpt1    Proteasome    88
Constructed proteasome chain 17    Rpt2    Proteasome    100
Constructed proteasome chain 18    Rpt6    Proteasome    100
Constructed proteasome chain 19    Rpt3    Proteasome    100
Constructed proteasome chain 2    Pup3    Proteasome    100
Constructed proteasome chain 20    Rpt4    Proteasome    100
Constructed proteasome chain 21    Rpt5    Proteasome    93.1
Constructed proteasome chain 22    Rpn2    Proteasome    90.9
Constructed proteasome chain 23    Rpn9    Proteasome    100
Constructed proteasome chain 24    Rpn5    Proteasome    100
Constructed proteasome chain 25    Rpn6    Proteasome    100
Constructed proteasome chain 26    Rpn7    Proteasome    100
Constructed proteasome chain 27    Rpn3    Proteasome    100
Constructed proteasome chain 28    Rpn12    Proteasome    100
Constructed proteasome chain 29    Rpn8    Proteasome    82.9
Constructed proteasome chain 3    Pre1    Proteasome    100
Constructed proteasome chain 30    Rpn11    Proteasome    89.5
Constructed proteasome chain 31    Rpn10    Proteasome    100
Constructed proteasome chain 32    Rpn13    Proteasome    100
Constructed proteasome chain 33    Sem1    Proteasome    100
Constructed proteasome chain 34    Rpn1    Proteasome    85.9
Constructed proteasome chain 35    Pup1    Proteasome    100
Constructed proteasome chain 36    Pup3    Proteasome    100
Constructed proteasome chain 37    Pre1    Proteasome    100
Constructed proteasome chain 38    Pre2    Proteasome    100
Constructed proteasome chain 39    Pre7    Proteasome    100
Constructed proteasome chain 4    Pre2    Proteasome    100
Constructed proteasome chain 40    Pre4    Proteasome    100
Constructed proteasome chain 41    Pre3    Proteasome    100
Constructed proteasome chain 42    Pre4    Proteasome    100
Constructed proteasome chain 45    Scl1    Proteasome    100
Constructed proteasome chain 46    Pre8    Proteasome    100
Constructed proteasome chain 47    Pre9    Proteasome    100
Constructed proteasome chain 48    Pre6    Proteasome    100
Constructed proteasome chain 49    Pup2    Proteasome    100
Constructed proteasome chain 5    Pre7    Proteasome    100
Constructed proteasome chain 50    Pre5    Proteasome    100
Constructed proteasome chain 51    Pre10    Proteasome    100
Constructed proteasome chain 52    Rpt1    Proteasome    88
Constructed proteasome chain 53    Rpt2    Proteasome    100
Constructed proteasome chain 54    Rpt6    Proteasome    100
Constructed proteasome chain 55    Rpt3    Proteasome    100
Constructed proteasome chain 56    Rpt4    Proteasome    100
Constructed proteasome chain 57    Rpt5    Proteasome    93.1
Constructed proteasome chain 58    Rpn2    Proteasome    90.9
Constructed proteasome chain 59    Rpn9    Proteasome    100
Constructed proteasome chain 6    Pre3    Proteasome    100
Constructed proteasome chain 60    Rpn5    Proteasome    100
Constructed proteasome chain 61    Rpn6    Proteasome    100
Constructed proteasome chain 62    Rpn7    Proteasome    100
Constructed proteasome chain 63    Rpn3    Proteasome    100
Constructed proteasome chain 64    Rpn12    Proteasome    100
Constructed proteasome chain 65    Rpn8    Proteasome    82.9
Constructed proteasome chain 66    Rpn11    Proteasome    89.5
Constructed proteasome chain 67    Rpn10    Proteasome    100
Constructed proteasome chain 68    Rpn13    Proteasome    100
Constructed proteasome chain 69    Sem1    Proteasome    100
Constructed proteasome chain 70    Rpn1    Proteasome    85.9
Constructed proteasome chain 9    Scl1    Proteasome    100


Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

Columns: yeast protein; complex; reference structure; number of missing residues at the C-terminus

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) are positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right owing to the ability of the 4xL linker to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
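The thresholding described in panel (B) can be illustrated with a minimal sketch. It assumes that the z-score of a colony is computed against the mean and standard deviation of a background distribution of colony sizes; the caption does not state which reference distribution was used, so that choice, the function name `call_positive_ppis` and all numerical values below are hypothetical.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # threshold quoted in the caption of Figure S1 (B)


def call_positive_ppis(colony_sizes, background_sizes, z_threshold=Z_THRESHOLD):
    """Return {pair: z} for pairs whose colony-size z-score exceeds the threshold.

    Assumption (not stated in the source): z = (size - mean) / stdev,
    with mean and stdev taken from `background_sizes`.
    """
    mu = mean(background_sizes)
    sigma = stdev(background_sizes)
    if sigma == 0:
        raise ValueError("background distribution has zero variance")
    scores = {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}
    return {pair: z for pair, z in scores.items() if z > z_threshold}


# Hypothetical colony sizes (arbitrary units), for illustration only.
background = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0, 10.2, 9.8]
observed = {("Rpn5", "Rpn6"): 14.0, ("Rpn5", "Pre1"): 10.4}
positives = call_positive_ppis(observed, background)
```

With these made-up values only the first pair passes the threshold; a real analysis would take the background from the experiment itself.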


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
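A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm over a set of surface points. This is an illustration under stated assumptions, not the procedure used for the figure: the rule that two points are connected when they lie within a `cutoff` distance, the function name `shortest_surface_path` and the toy coordinates are all hypothetical.

```python
import heapq
import math


def shortest_surface_path(coords, start, goal, cutoff):
    """Dijkstra over 3D points.

    Two points are joined by an edge when their Euclidean distance is at most
    `cutoff` (an assumed connectivity rule); the edge weight is that distance.
    Returns (path_length, path) or (math.inf, []) if the goal is unreachable.
    """
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            path = [node]
            while node in previous:
                node = previous[node]
                path.append(node)
            return dist, path[::-1]
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for other, xyz in coords.items():
            if other == node:
                continue
            step = math.dist(coords[node], xyz)
            if step <= cutoff and dist + step < best.get(other, math.inf):
                best[other] = dist + step
                previous[other] = node
                heapq.heappush(queue, (dist + step, other))
    return math.inf, []


# Hypothetical coordinates: two C-termini and two surface points between them.
coords = {
    "Cter_A": (0.0, 0.0, 0.0),
    "surf_1": (4.0, 3.0, 0.0),
    "surf_2": (8.0, 0.0, 0.0),
    "Cter_B": (8.0, 0.0, 6.0),
}
length, path = shortest_surface_path(coords, "Cter_A", "Cter_B", cutoff=6.0)
straight_line = math.dist(coords["Cter_A"], coords["Cter_B"])
```

In this toy case the path along the surface points is longer than the straight line between the two termini, which is the reason for routing the distance over surface residues rather than through the complex.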


General conclusion

The aim of this project was to develop a relatively simple hybrid method. Here, "hybrid method" refers to a method that detects associations between proteins that are close to each other in space without necessarily interacting physically. Such a method would make it possible to explore and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test how the method applies to the study of protein complexes using several combinations of linkers of different lengths. Finally, validation could be completed by comparing the results obtained with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain practically as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only a single combination were used, because some protein-protein associations were detectable only with one specific combination of linkers. Using a lengthened linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and those whose signal increases.


The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. Nevertheless, these cases can still provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of lengthened linkers is informative about the organization of protein complexes, particularly when it involves the central proteins. Finally, the associations detected closely reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does make it possible to detect associations between proteins that are farther apart in space.
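As a rough illustration of the comparison between structural distances and PCA calls, the sketch below groups detected protein pairs by linker combination and reports the median distance of each group. It is not the analysis performed in the thesis: the function name `median_distance_by_linker`, the record layout and every distance value are hypothetical.

```python
from statistics import median


def median_distance_by_linker(records):
    """Median structural distance of detected pairs, per linker combination.

    `records` is an iterable of (linker_combination, distance, detected)
    tuples; undetected pairs are ignored.
    """
    grouped = {}
    for combination, distance, detected in records:
        if detected:
            grouped.setdefault(combination, []).append(distance)
    return {combination: median(values) for combination, values in grouped.items()}


# Hypothetical distances (arbitrary units) and detection calls.
records = [
    ("2xL-2xL", 40.0, True), ("2xL-2xL", 60.0, True), ("2xL-2xL", 150.0, False),
    ("4xL-4xL", 40.0, True), ("4xL-4xL", 60.0, True), ("4xL-4xL", 150.0, True),
]
medians = median_distance_by_linker(records)
```

A larger median for the longer linker combination, as in this made-up example, is the pattern one would expect if longer linkers reach more distant partners.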

The modification made to the DHFR PCA represents a real advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the longer linkers, which would provide validation and a point of comparison, and would reveal problems that may have arisen in the construction of the proteins. For example, it becomes easier to spot a faulty recombination event or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Admittedly, adding this control makes the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths giving several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also need to be determined, to maximize the new information gained while taking into account the additional complexity introduced with each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas a lower resolution would be appropriate for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010;(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015;(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


an interaction over time. However, the high background and the partial overlap of the fluorescence of the two proteins can hinder the interpretation of the results (60-63).

Cross-linking followed by MS is practically identical to the purification and MS techniques, except that before purification the proteins are attached to one another by covalent bonds. These bonds withstand enzymatic digestion and thus provide structural information on how the proteins associate within the protein complex. Nevertheless, cross-linking complicates data analysis and can lead to a mistaken picture of the architecture of the protein complex. This method is difficult to apply to the global study of protein complexes (64-67).

BioID uses biotinylation to mark contact between the protein of interest and nearby proteins. Biotinylation is carried out by a mutant biotin ligase that lacks specificity and is fused to the protein of interest. Interactors carrying a biotin group on their accessible lysines are selectively isolated and identified by MS. BioID can detect weak and transient interactions as well as interactions between neighbouring proteins. However, the biotin ligase is larger than the green fluorescent protein (GFP), a fluorescent protein widely used in molecular biology. This large size can interfere with the activity of the protein of interest or with the formation of interactions. Moreover, this method is not quantitative (68).

1.4 Current challenge in the study of protein-protein interactions

The hybrid methods described above are particularly interesting because they give a more global view of the PPI network. They provide information on the proximity of proteins, giving access to a new scale of molecular resolution that is otherwise difficult to reach. Besides being complex, the existing techniques require specialized infrastructure (equipment and databases) and are difficult to apply on a large scale. Developing simpler, higher-throughput hybrid methods would make it possible to better define the architecture of protein complexes and their subcomplexes at low molecular resolution. Such methods would complement the two categories of methods. These new hybrid methods would compensate for the shortcomings of high-molecular-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these high-resolution methods are difficult to apply to many protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity and of the linker that connects the reporter fragments to the proteins of interest, PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment composed of two repeats of the following motif: four glycines and one serine (GGGGS). It ensures good flexibility and good association of the reporter fragments in the cellular environment. Indeed, glycine and serine are two small amino acids, the former nonpolar and the latter polar and uncharged. The linker connects the reporter fragment to the C-terminus of the proteins under study.

Linker length also imposes a constraint on the ability to detect an interaction, as notably observed by the research team that developed large-scale PCA (55). While studying RNA polymerase (RNApol) II and several other protein complexes, the authors noted that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing linker length made it possible to determine the conformation of a dimeric receptor (69). It is therefore possible to detect new interactions and, in doing so, to obtain new structural information.

1.6 Research objectives

The previous results suggest that linker length can influence our ability to detect PPIs. The hypothesis of my work was that increasing the linker length of the DHFR PCA would allow the detection of interactions increasingly distant in space, which would modulate the scale of molecular resolution. This adaptation would then yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, protein-protein associations between 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes differing in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The last objective was to determine whether increasing linker length made it possible to detect associations between proteins that are farther apart in space. To do so, distances were calculated between the proteins contained in the proteasome structures and compared with the experimental results.

This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive in several respects, notably the availability of numerous powerful genetic tools, its rapid cell division, and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Résumé

Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results open new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast Protein-fragment Complementation Assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary: the first tells us which proteins assemble into complexes in the cell, and the second how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were less than 82 Å apart. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
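The linker variants described above can be written out explicitly at the peptide level. The following is a minimal sketch: the helper name `linker_peptide` is hypothetical, and no DNA sequence is shown because the synonymous codons used in the synthesized repeats are not given in the text.

```python
MOTIF = "GGGGS"  # Gly-Gly-Gly-Gly-Ser, the repeat unit described in the text


def linker_peptide(repeats: int) -> str:
    """Return the peptide sequence of a linker made of `repeats` copies of the motif."""
    if repeats < 1:
        raise ValueError("a linker needs at least one repeat")
    return MOTIF * repeats


# 2xL is the original linker; 3xL and 4xL add one and two repeats, respectively.
linkers = {f"{n}xL": linker_peptide(n) for n in (2, 3, 4)}

for name, sequence in linkers.items():
    print(name, sequence, len(sequence), "residues")
```

The 2xL linker therefore spans ten residues, in line with the ten-amino-acid linker mentioned in the Introduction, while the 3xL and 4xL linkers span 15 and 20 residues.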

To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusion for all strains.

Estimation of protein abundance

Protein abundance was quantified by Western blot for several strains with proteins fused to the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies directed against them. 20 OD600 of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10,000), IRDye® 680RD donkey anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG (1:10,000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them, based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published by Benschop et al. (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested), proteasome (n=47, 44 tested) and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above.


Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey plates were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs whose signal increased, did not change or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution was spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted, and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and applied a manual threshold across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze Particles” in ImageJ64. We excluded particles touching the edge of the selection and those with an area below 20 pixels and a circularity below 0.5, and used the particle closest to the center. We considered the particle to be a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2 transformed after adding 1 to each value to avoid taking the logarithm of zero. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
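The transformation and filter described above can be sketched as follows. This is only an illustration: the function name `transform_colony` and its arguments are hypothetical, and the threshold of 16 is taken as quoted in the text.

```python
import math

MIN_DIPLOID_SIZE = 16  # size threshold quoted in the text, on the diploid selection plate


def transform_colony(intensity_mtx: float, size_diploid: float):
    """Return log2(intensity + 1), or None if the colony fails the diploid size filter.

    Adding 1 before the transform keeps colonies with zero intensity defined
    (log2(0 + 1) = 0) instead of producing minus infinity.
    """
    if size_diploid < MIN_DIPLOID_SIZE:
        return None  # colony eliminated: too small on the diploid selection plate
    return math.log2(intensity_mtx + 1)


# Hypothetical colonies: (intensity on MTX at day 4, size on diploid selection)
colonies = [(0, 120), (1023, 140), (500, 10)]
transformed = [transform_colony(i, s) for i, s in colonies]
print(transformed)
```

Returning `None` for filtered colonies, rather than dropping them, keeps the output aligned with the array positions, which is convenient when replicates are grouped later.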

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained, and the median colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected. These Zs were used to compare the same interaction across linker size combinations. We considered a change significant when Zs differed by more than 2.
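The scoring step above, Zs = (Is − b)/sdb with a detection threshold of 2.5, can be illustrated with a short sketch. The background parameters `b` and `sdb` are taken here as given inputs with made-up values; in the study they come from the two-component normal mixture fitted in R, which this sketch does not reproduce. The function names are hypothetical.

```python
from statistics import median

Z_THRESHOLD = 2.5    # detection threshold on Zs stated in the text
MIN_REPLICATES = 2   # global PCA experiment: at least two replicates required


def interaction_score(replicates):
    """Median of the log2 colony sizes of one bait-prey pair (the Is)."""
    if len(replicates) < MIN_REPLICATES:
        raise ValueError("not enough replicates for this interaction")
    return median(replicates)


def z_score(score, b, sdb):
    """Convert an interaction score into a z-score against the background."""
    return (score - b) / sdb


# Hypothetical values for illustration only.
b, sdb = 8.0, 0.5                      # background mean and standard deviation
replicates = [9.4, 9.6, 9.5, 9.5]      # log2 colony sizes of four replicates

score = interaction_score(replicates)
zs = z_score(score, b, sdb)
detected = zs > Z_THRESHOLD
print(score, zs, detected)
```

Using the median rather than the mean makes the score less sensitive to a single aberrant colony among the replicates.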

For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained, and the median colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) gave inconsistent results for all tested interactions, leaving a total of 228 (197 unique) pairs of interacting proteins.
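The extreme-outlier rule used in this filtering, with fences at Q1 − 3(Q3 − Q1) and Q3 + 3(Q3 − Q1), can be sketched as below. The function names are hypothetical, and the quartile convention (Python's "inclusive" method, which matches the default of R's `quantile`) is an assumption, since the text does not say how quartiles were computed.

```python
from statistics import quantiles


def extreme_outlier_fences(values, k=3.0):
    """Return (low, high) fences at Q1 - k*IQR and Q3 + k*IQR."""
    # Assumption: quartiles by linear interpolation over the observed range.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_extreme_outliers(values, k=3.0):
    """Keep only the values lying within the extreme-outlier fences."""
    low, high = extreme_outlier_fences(values, k)
    return [v for v in values if low <= v <= high]


# Hypothetical colony sizes with one aberrant value.
sizes = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(extreme_outlier_fences(sizes))
print(drop_extreme_outliers(sizes))
```

A multiplier of 3 is more permissive than the usual 1.5 used for box plots, so only values far from the bulk of the distribution are discarded.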

At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that all combinations of linker lengths could be tested for a specific interaction in a block of four adjacent strains (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference from the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, where the interaction was detected with the reference linker size (2xL-2xL) and showed a significant change for at least one longer linker combination (difference greater than 1 or smaller than −1, with t-test FDR p-value < 0.05) (Table S1C).
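The classification rule above can be sketched as follows. Two assumptions are made: the text only says "FDR corrected", so the Benjamini-Hochberg procedure is assumed here, and the raw p-values are taken as given inputs rather than recomputed from a paired t-test. Function names and example numbers are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(difference, q_value, min_diff=1.0, alpha=0.05):
    """Label one linker combination relative to the 2xL-2xL reference."""
    if q_value >= alpha:
        return "unchanged"
    if abs(difference) > min_diff:
        return "quantitative change"
    # Significant but small difference: the text does not name this case.
    return "unclassified"


# Hypothetical differences in Is (longer linker minus 2xL-2xL) and raw p-values.
differences = [1.8, 0.2, -1.4, 0.6]
pvalues = [0.01, 0.5, 0.03, 0.04]

qvalues = benjamini_hochberg(pvalues)
labels = [classify(d, q) for d, q in zip(differences, qvalues)]
print(list(zip(differences, qvalues, labels)))
```

In this toy example the third pair has a large difference but fails the corrected threshold, which shows why the correction matters when many linker combinations are tested at once.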

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core, and of 5CZ4 for the inner rings of the core. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the two nodes. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered as surface residues (see Fig S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
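
The distance calculation described above can be sketched as follows. This is a simplified illustration, not the pipeline used for the thesis: the text states that the Dijkstra implementation of NetworkX was used, whereas this sketch substitutes a small standard-library version so that it runs without dependencies. All function names are hypothetical, the inputs are assumed to be plain dictionaries mapping a residue label to its Cα coordinates, and the way copies are averaged in `mean_minimal_distance` is only one possible reading of "the mean of the minimal possible distances".

```python
import heapq
import math


def surface_nodes(ca_coords, dots, radius):
    """Keep residues whose C-alpha lies within `radius` of any surface dot."""
    return {
        res: xyz
        for res, xyz in ca_coords.items()
        if any(math.dist(xyz, d) <= radius for d in dots)
    }


def build_graph(nodes, cutoff):
    """Connect every pair of nodes at most `cutoff` apart; weight = distance."""
    graph = {res: {} for res in nodes}
    items = list(nodes.items())
    for i, (a, xyz_a) in enumerate(items):
        for b, xyz_b in items[i + 1:]:
            d = math.dist(xyz_a, xyz_b)
            if d <= cutoff:
                graph[a][b] = d
                graph[b][a] = d
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if the target is unreachable.

    Node labels must be mutually comparable (e.g. all strings), because
    they are used to break ties in the priority queue.
    """
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node].items():
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf


def mean_minimal_distance(graph, copies_a, copies_b):
    """Assumed reading: for each copy of A, take the path to its nearest
    copy of B, then average these minima over the copies of A."""
    minima = [
        min(shortest_path_length(graph, a, b) for b in copies_b)
        for a in copies_a
    ]
    return sum(minima) / len(minima)
```

The cutoff and radius are left as parameters so that the values quoted in the text for each complex can be passed in. Because paths are constrained to hop between surface residues, the resulting distance follows the surface of the complex rather than cutting through it.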

All PPI data related to the global PCA and intra-complex experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig S1A), suggesting that if linker length affects protein stability, the effect is minor and not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, each fused to the DHFR F[1,2] fragment with the 2xL, 3xL and 4xL. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These preys include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
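
As a rough illustration of how a z-score threshold turns colony sizes into a list of detected PPIs, consider the sketch below. The exact z-score definition is not given in this section, so the sketch assumes the simplest form, namely the deviation of a colony size from the mean of a background distribution, in units of its standard deviation. The function names and the choice of background are hypothetical.

```python
from statistics import mean, stdev


def z_scores(colony_sizes, background):
    """Express each colony size as a deviation from the background noise.

    colony_sizes: dict mapping a (bait, prey) pair to its colony size.
    background:   colony sizes assumed to represent non-interacting pairs.
    """
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation
    return {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}


def detected(zs, threshold=2.5):
    """Return the pairs whose z-score exceeds the detection threshold."""
    return {pair for pair, z in zs.items() if z > threshold}
```

Counting the size of `detected(...)` separately for the 2xL, 3xL and 4xL baits would give the kind of per-linker tally reported above, under the stated assumptions.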

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig S1B and Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique pairs when reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], are counted as a single PPI).
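
The difference between the two counts above (pairs versus unique pairs) comes from treating the two bait-prey orientations of the same proteins as one PPI. A minimal sketch of that bookkeeping, with the hypothetical function name `unique_pairs` and placeholder protein names X, Y and Z:

```python
def unique_pairs(detected_pairs):
    """Collapse (bait, prey) orientations into unordered protein pairs.

    ("X", "Y") and ("Y", "X") map to the same frozenset, so reciprocal
    interactions are counted once.
    """
    return {frozenset(pair) for pair in detected_pairs}


pairs = [("X", "Y"), ("Y", "X"), ("X", "Z")]
n_pairs = len(pairs)                 # counts each orientation separately
n_unique = len(unique_pairs(pairs))  # counts reciprocal interactions once
```

Here `n_pairs` is 3 and `n_unique` is 2, since X-Y was detected in both orientations.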

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B). PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions were detected for RNApol complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe a or lobe b (Fig 1D, right panel). Together, these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, the PCA signal obtained with longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases signal or leads to the detection of new interactions (Fig 2C). This demonstrates once again that longer linkers enhance the ability to detect interactions, especially for proteins that are more distant in space.
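
The rank correlation between pairwise distance and z-score mentioned above can be computed as sketched below. This is a dependency-free illustration rather than the analysis code of the thesis (the p-value format in the Figure 2 caption suggests a statistics package was used). The function names are hypothetical, and the sketch returns only the coefficient, not a p-value.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(distances, zscores):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(distances), average_ranks(zscores))
```

A negative value of `spearman(distances, zscores)` corresponds to the trend described above: the larger the distance between two C-termini, the lower the z-score tends to be.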

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information on the subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1 Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation (PCA) screen and prove useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, counting reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs over the total tested for each combination of subcomplexes within complexes.


Figure 2 Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request

Table S1B PCA data for global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request

Table S1C PCA data for intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request

Table S1D PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request


Table S2A Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request


Table S2B Identity between each RNApol structure and the experimental sequences

Reference Yeast proteins Complex Identity (%)
4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2


Table S2C Identity between proteasome structure and the experimental sequence

Reference Yeast proteins Complex Identity (%)
5CZ4-centered chain A Pre8 Proteasome 100
5CZ4-centered chain AA Pre4 Proteasome 100
5CZ4-centered chain B Pre9 Proteasome 100
5CZ4-centered chain BA Pre3 Proteasome 100
5CZ4-centered chain C Pre6 Proteasome 100
5CZ4-centered chain D Pup2 Proteasome 97.1
5CZ4-centered chain E Pre5 Proteasome 100
5CZ4-centered chain F Pre10 Proteasome 100
5CZ4-centered chain G Scl1 Proteasome 100
5CZ4-centered chain H Pup1 Proteasome 100
5CZ4-centered chain I Pup3 Proteasome 100
5CZ4-centered chain J Pre1 Proteasome 100
5CZ4-centered chain K Pre2 Proteasome 100
5CZ4-centered chain L Pre7 Proteasome 100
5CZ4-centered chain M Pre4 Proteasome 100
5CZ4-centered chain N Pre3 Proteasome 100
5CZ4-centered chain O Pre8 Proteasome 100
5CZ4-centered chain P Pre9 Proteasome 100
5CZ4-centered chain Q Pre6 Proteasome 100
5CZ4-centered chain R Pup2 Proteasome 97.1
5CZ4-centered chain S Pre5 Proteasome 100
5CZ4-centered chain T Pre10 Proteasome 100
5CZ4-centered chain U Scl1 Proteasome 100
5CZ4-centered chain V Pup1 Proteasome 100
5CZ4-centered chain W Pup3 Proteasome 100
5CZ4-centered chain X Pre1 Proteasome 100
5CZ4-centered chain Y Pre2 Proteasome 100
5CZ4-centered chain Z Pre7 Proteasome 100
5A5B-centered chain A Pre3 Proteasome 100
5A5B-centered chain AA Rpn7 Proteasome 100
5A5B-centered chain B Pup1 Proteasome 100
5A5B-centered chain BA Rpn3 Proteasome 100
5A5B-centered chain C Pup3 Proteasome 100
5A5B-centered chain CA Rpn12 Proteasome 100
5A5B-centered chain D Pre1 Proteasome 100
5A5B-centered chain DA Rpn8 Proteasome 82.9
5A5B-centered chain E Pre2 Proteasome 99.5
5A5B-centered chain EA Rpn11 Proteasome 89.5
5A5B-centered chain F Pre7 Proteasome 100
5A5B-centered chain FA Rpn10 Proteasome 100
5A5B-centered chain G Pre4 Proteasome 100
5A5B-centered chain GA Rpn13 Proteasome 100
5A5B-centered chain HA Sem1 Proteasome 100
5A5B-centered chain IA Rpn1 Proteasome 85.9
5A5B-centered chain J Scl1 Proteasome 100
5A5B-centered chain K Pre8 Proteasome 100
5A5B-centered chain L Pre9 Proteasome 100
5A5B-centered chain M Pre6 Proteasome 100
5A5B-centered chain N Pup2 Proteasome 100
5A5B-centered chain O Pre5 Proteasome 100
5A5B-centered chain P Pre10 Proteasome 100
5A5B-centered chain Q Rpt1 Proteasome 88
5A5B-centered chain R Rpt2 Proteasome 100
5A5B-centered chain S Rpt6 Proteasome 100
5A5B-centered chain T Rpt3 Proteasome 100
5A5B-centered chain U Rpt4 Proteasome 100
5A5B-centered chain V Rpt5 Proteasome 93.1
5A5B-centered chain W Rpn2 Proteasome 90.9
5A5B-centered chain X Rpn9 Proteasome 100
5A5B-centered chain Y Rpn5 Proteasome 100
5A5B-centered chain Z Rpn6 Proteasome 100
Constructed proteasome chain 1 Pup1 Proteasome 100
Constructed proteasome chain 10 Pre8 Proteasome 100
Constructed proteasome chain 11 Pre9 Proteasome 100
Constructed proteasome chain 12 Pre6 Proteasome 100
Constructed proteasome chain 13 Pup2 Proteasome 100
Constructed proteasome chain 14 Pre5 Proteasome 100
Constructed proteasome chain 15 Pre10 Proteasome 100
Constructed proteasome chain 16 Rpt1 Proteasome 88
Constructed proteasome chain 17 Rpt2 Proteasome 100
Constructed proteasome chain 18 Rpt6 Proteasome 100
Constructed proteasome chain 19 Rpt3 Proteasome 100
Constructed proteasome chain 2 Pup3 Proteasome 100
Constructed proteasome chain 20 Rpt4 Proteasome 100
Constructed proteasome chain 21 Rpt5 Proteasome 93.1
Constructed proteasome chain 22 Rpn2 Proteasome 90.9
Constructed proteasome chain 23 Rpn9 Proteasome 100
Constructed proteasome chain 24 Rpn5 Proteasome 100
Constructed proteasome chain 25 Rpn6 Proteasome 100
Constructed proteasome chain 26 Rpn7 Proteasome 100
Constructed proteasome chain 27 Rpn3 Proteasome 100
Constructed proteasome chain 28 Rpn12 Proteasome 100
Constructed proteasome chain 29 Rpn8 Proteasome 82.9
Constructed proteasome chain 3 Pre1 Proteasome 100
Constructed proteasome chain 30 Rpn11 Proteasome 89.5
Constructed proteasome chain 31 Rpn10 Proteasome 100
Constructed proteasome chain 32 Rpn13 Proteasome 100
Constructed proteasome chain 33 Sem1 Proteasome 100
Constructed proteasome chain 34 Rpn1 Proteasome 85.9
Constructed proteasome chain 35 Pup1 Proteasome 100
Constructed proteasome chain 36 Pup3 Proteasome 100
Constructed proteasome chain 37 Pre1 Proteasome 100
Constructed proteasome chain 38 Pre2 Proteasome 100
Constructed proteasome chain 39 Pre7 Proteasome 100
Constructed proteasome chain 4 Pre2 Proteasome 100
Constructed proteasome chain 40 Pre4 Proteasome 100
Constructed proteasome chain 41 Pre3 Proteasome 100
Constructed proteasome chain 42 Pre4 Proteasome 100
Constructed proteasome chain 45 Scl1 Proteasome 100
Constructed proteasome chain 46 Pre8 Proteasome 100
Constructed proteasome chain 47 Pre9 Proteasome 100
Constructed proteasome chain 48 Pre6 Proteasome 100
Constructed proteasome chain 49 Pup2 Proteasome 100
Constructed proteasome chain 5 Pre7 Proteasome 100
Constructed proteasome chain 50 Pre5 Proteasome 100
Constructed proteasome chain 51 Pre10 Proteasome 100
Constructed proteasome chain 52 Rpt1 Proteasome 88
Constructed proteasome chain 53 Rpt2 Proteasome 100
Constructed proteasome chain 54 Rpt6 Proteasome 100
Constructed proteasome chain 55 Rpt3 Proteasome 100
Constructed proteasome chain 56 Rpt4 Proteasome 100
Constructed proteasome chain 57 Rpt5 Proteasome 93.1
Constructed proteasome chain 58 Rpn2 Proteasome 90.9
Constructed proteasome chain 59 Rpn9 Proteasome 100
Constructed proteasome chain 6 Pre3 Proteasome 100
Constructed proteasome chain 60 Rpn5 Proteasome 100
Constructed proteasome chain 61 Rpn6 Proteasome 100
Constructed proteasome chain 62 Rpn7 Proteasome 100
Constructed proteasome chain 63 Rpn3 Proteasome 100
Constructed proteasome chain 64 Rpn12 Proteasome 100
Constructed proteasome chain 65 Rpn8 Proteasome 82.9
Constructed proteasome chain 66 Rpn11 Proteasome 89.5
Constructed proteasome chain 67 Rpn10 Proteasome 100
Constructed proteasome chain 68 Rpn13 Proteasome 100
Constructed proteasome chain 69 Sem1 Proteasome 100
Constructed proteasome chain 70 Rpn1 Proteasome 85.9
Constructed proteasome chain 9 Scl1 Proteasome 100


Table S2D Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

Yeast proteins Complex Reference Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1 Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions further in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results of the intra-complex experiment. Examples for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates with colony size in standard PCA (left).


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
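The distance-weighted shortest path in panels (C) and (D) can be pictured with a small sketch. It assumes, for illustration only, that surface residues are treated as graph nodes, that two residues are linked when they lie within a chosen cutoff, and that each link is weighted by its Euclidean length; Dijkstra's algorithm then gives the shortest route along the surface. The coordinates, the cutoff value and the function name are hypothetical and are not taken from the 5A5B or 5CZ4 structures.

```python
import heapq
import math

def shortest_surface_path(points, start, end, cutoff):
    """Dijkstra over a neighbour graph: nodes are surface points, edges join
    points no further apart than `cutoff`, and each edge is weighted by its length."""
    n = len(points)
    best = [math.inf] * n
    best[start] = 0.0
    queue = [(0.0, start)]
    while queue:
        dist, i = heapq.heappop(queue)
        if i == end:
            return dist
        if dist > best[i]:
            continue  # stale queue entry
        for j in range(n):
            if j == i:
                continue
            step = math.dist(points[i], points[j])
            if step <= cutoff and dist + step < best[j]:
                best[j] = dist + step
                heapq.heappush(queue, (best[j], j))
    return math.inf  # the two points are not connected at this cutoff

# Toy coordinates in Å (hypothetical). Points 0 and 3 are 6 Å apart in a
# straight line, but the cutoff forces the path to go through point 1.
surface = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 8.0, 0.0), (6.0, 0.0, 0.0)]
path_length = shortest_surface_path(surface, start=0, end=3, cutoff=5.5)
straight_line = math.dist(surface[0], surface[3])
```

In this toy case the path along the surface is longer than the straight-line distance, which is the reason a surface path may be a more realistic measure of how far two linkers have to reach around a complex.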


General conclusion

The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid method is one that detects associations between proteins that lie close together in space, whether or not they interact physically. Such a method would allow the architecture of protein complexes to be dissected in greater depth. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to check whether lengthening the linker changed the set of interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results with distances measured from the available structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and allowed associations to be identified more reliably. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while retaining the specificity of the method. Indeed, more than 30% of all unique interactions detected were new. A similar proportion of new interactions could therefore be expected if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only a single combination were used, because some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment alone appears to be sufficient to detect most of the new PPIs and most of those whose signal increases.


The rare cases in which the signal decreased as the linker was lengthened are more likely to be caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when central proteins are involved. Finally, the associations detected reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that lengthening the linker does allow the detection of associations between proteins that are further apart in space.

The modification made to the DHFR PCA is a useful advance for the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside linkers of additional lengths. This would provide a validation and a point of comparison, and would help detect problems that may have arisen during construction of the fusion proteins, such as incorrect recombination or the appearance of mutations. Interactions would be observed for the correctly built protein, whereas the faulty one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are brought very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features give it certain advantages over other hybrid methods.

Only two linker lengths were tested in this project. It would be interesting to establish a range of linker lengths that would give access to several resolutions of the PPI network. The first step would be to determine the maximal length that still detects plausible protein-protein associations while limiting false positives. It would also be necessary to determine the optimal increment, so as to maximize the new information gained while taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.

The method developed during this master's project is particularly attractive for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution provided by lengthening the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of how the cellular machinery is organized.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


would compensate for the limitations of high-molecular-resolution methods such as crystallography or nuclear magnetic resonance, which determine the precise structure of proteins or protein complexes. Indeed, these methods are difficult to apply to many protein complexes and require an approach tailored to each complex.

1.5 The linker: a potentially useful parameter for modulating the detection of protein-protein interactions

Because of its relative simplicity, and because of the linker that connects the reporter fragments to the proteins of interest, the PCA is a method of choice for developing a hybrid method. The linker is a short, soluble and flexible peptide segment made of two repeats of the following motif: four glycines and one serine (GGGGS). It provides good flexibility and allows the reporter fragments to associate properly in the cellular environment. Indeed, glycine and serine are two small amino acids, the former nonpolar and the latter polar and uncharged. The linker joins the reporter fragment to the C-terminus of the proteins under study.

The length of the linker also constrains the ability to detect an interaction, as was observed by the research team that developed the large-scale PCA (55). While studying RNA polymerase (RNApol) II and several other protein complexes, the authors noticed that an interaction was 3.5 times more likely to be detected when the C-termini of the proteins of interest were less than 82 Å apart (55). This distance corresponds to the length of the two linkers placed end to end. In addition, an earlier study had shown that increasing the length of the linker made it possible to determine the conformation of a dimeric receptor (69). Lengthening the linker could therefore allow new interactions to be detected and, in doing so, provide new structural information.
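The relationship between linker length and detectable distance can be made concrete with a short calculation. This is a sketch that assumes the maximal span scales linearly with the number of GGGGS repeats, which is a simplification for a flexible peptide. The 82 Å value and the two-repeat standard linker come from the text above; the function name and the four-repeat (doubled) linker are illustrative.

```python
# From the text: a standard linker has two GGGGS repeats, and two standard
# linkers placed end to end span 82 Å.
STANDARD_REPEATS = 2
STANDARD_PAIR_SPAN = 82.0  # Å
RESIDUES_PER_REPEAT = 5    # G, G, G, G, S

# Length per residue implied by the linear-scaling assumption.
angstrom_per_residue = STANDARD_PAIR_SPAN / (2 * STANDARD_REPEATS * RESIDUES_PER_REPEAT)

def pair_span(repeats_a, repeats_b):
    """Assumed maximal span (Å) bridged by two linkers with the given repeat counts."""
    return STANDARD_PAIR_SPAN * (repeats_a + repeats_b) / (2 * STANDARD_REPEATS)

spans = {(a, b): pair_span(a, b) for a, b in [(2, 2), (2, 4), (4, 4)]}
for (a, b), span in spans.items():
    print(f"{a} repeats + {b} repeats: {span:.0f} Å")
```

Under this assumption, doubling one linker extends the reach from 82 Å to 123 Å, and doubling both extends it to 164 Å. Real linkers are flexible and rarely fully extended, so these figures are best read as upper bounds.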

1.6 Research objectives

The previous results suggest that the length of the linker can influence our ability to detect PPIs. The hypothesis of my work was that increasing the length of the linker in the DHFR PCA would allow the detection of interactions between proteins that are progressively further apart in space, thereby modulating the scale of molecular resolution. This adaptation would yield a new hybrid method that could help define protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess how increasing the linker length affects our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes that differ in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The last objective was to determine whether increasing the length of the linkers allowed the detection of associations between proteins that are further apart in space. To do so, distances between the proteins contained in the proteasome structures were calculated and compared with the experimental results.

This study was carried out in the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive for several reasons, notably the availability of many powerful genetic tools, its rapid cell division and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a key role in advancing knowledge in various fields, such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).


Measuring proximate protein association in living cells using Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural constraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex membership or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, assignment of a protein to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods that detect co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was shown that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between C-termini of proteins could be measured, protein interactions were 35 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter allows the detection of configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.

To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 with flanking BamHI and XbaI restriction sites. The 3xL/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed into E. coli and clones were selected on 2YT+Amp. Finally, positive clones were confirmed by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3xL/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3xL/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR-amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for pAG25-3xL/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. 2xL/3xL/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.

Estimation of protein abundance

Protein quantification was performed by Western blot for several strains carrying proteins fused with the 2xL and 4xL. These proteins were selected because their abundance could readily be assessed using antibodies raised against them. Exponentially growing cells (20 OD600 units) were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL pepstatin A, 0.5 µg/mL leupeptin and 2 µg/mL aprotinin). Glass beads (425-600 µm; Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gel and transferred onto a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10,000), IRDye® 680RD donkey anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG (1:10,000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2xL/3xL/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them according to data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complex experiment, we reviewed the literature and considered the consensus protein complexes published in (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 of the 95 proteins initially selected are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above.


Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
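The block design described above can be sketched as follows. This is a minimal illustration only: the protein names `X` and `Y` and the helper `linker_block` are hypothetical, and the sketch enumerates the four bait-prey linker combinations of one block without reproducing the actual array layout.

```python
from itertools import product

# Linker lengths used in the intra-complex experiment.
LINKERS = ("2xL", "4xL")


def linker_block(bait: str, prey: str) -> list[tuple[str, str]]:
    """Return the four bait/prey linker combinations tested in one block."""
    return [
        (f"{bait}-{bait_linker}", f"{prey}-{prey_linker}")
        for bait_linker, prey_linker in product(LINKERS, repeat=2)
    ]


if __name__ == "__main__":
    for bait_strain, prey_strain in linker_block("X", "Y"):
        print(bait_strain, "x", prey_strain)
```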

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates in 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we used standard DHFR PCA to confirm 25 PPIs for which the difference in signal between linkers was increased, null or decreased. The same procedure as described above was used to assess growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments is shown in Fig. S1E. For the intra-complex experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted, and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and applied a manual threshold across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, using the particle closest to the center. We considered a particle to be a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth on the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
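The transformation and size filter described above can be illustrated with a short sketch. It assumes each colony is given as a pair (MTX intensity, size on the diploid selection plate); the function name and the input layout are hypothetical and are not taken from the original analysis scripts.

```python
import math

# Colonies smaller than this on the diploid selection plate are discarded.
MIN_DIPLOID_SIZE = 16


def log2_intensities(colonies):
    """Return log2(intensity + 1) for colonies that pass the diploid size filter.

    colonies: iterable of (mtx_intensity, diploid_size) pairs.
    """
    return [
        math.log2(intensity + 1)
        for intensity, diploid_size in colonies
        if diploid_size >= MIN_DIPLOID_SIZE
    ]


if __name__ == "__main__":
    # The third colony is dropped because it is too small on the diploid plate.
    print(log2_intensities([(0, 20), (3, 16), (100, 10)]))
```

Adding 1 before the logarithm keeps colonies with zero intensity in the data set (they map to 0) instead of producing an undefined value.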

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction across linker size combinations. We considered a change significant when Zs differed by more than 2.
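To make the scoring step above concrete, the following sketch fits a two-component normal mixture by expectation-maximization and converts scores to z-scores against the lower (background) component. It is an assumption-laden stand-in written in plain Python, not the mixtools implementation used in the study: the initialization, the fixed number of iterations, the function names and the synthetic data are all illustrative choices.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(scores, n_iter=200):
    """Fit a two-component 1-D normal mixture by EM.

    Returns two (weight, mean, sd) tuples sorted by mean, so the first
    tuple is treated as the background component.
    """
    xs = sorted(scores)
    half = len(xs) // 2
    # Crude initialization (an assumption): lower and upper halves of the data.
    params = [
        [0.5, statistics.mean(xs[:half]), statistics.pstdev(xs[:half]) or 1.0],
        [0.5, statistics.mean(xs[half:]), statistics.pstdev(xs[half:]) or 1.0],
    ]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, mu, sd) for w, mu, sd in params]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weight, mean and standard deviation of each component.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            mu = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, xs)) / nk
            params[k] = [nk / len(xs), mu, max(math.sqrt(var), 1e-6)]
    return sorted((tuple(p) for p in params), key=lambda p: p[1])


def z_scores(scores):
    """Convert interaction scores to Zs = (Is - b) / sdb using the background component."""
    _, b, sd_b = fit_two_normals(scores)[0]
    return [(s - b) / sd_b for s in scores]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic scores: a large background population and a smaller signal population.
    demo = [rng.gauss(8, 0.5) for _ in range(500)] + [rng.gauss(13, 0.5) for _ in range(50)]
    detected = sum(z > 2.5 for z in z_scores(demo))
    print(detected, "scores exceed the 2.5 z-score threshold")
```

Sorting the fitted components by mean is what designates the background here; this only makes sense when non-interacting pairs form the lower mode of the score distribution.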

For the intra-complex experiment, extreme outliers on the MTX selection plates, i.e. values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 are the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. Of the 236 protein pairs with an interaction detected with at least one linker combination, some were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) gave incoherent results for all tested interactions, leaving a total of 228 (197 unique) pairs of interacting proteins.
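The extreme-outlier rule and the median-based interaction score described above can be sketched as follows. The quartile convention is an assumption (the default of Python's `statistics.quantiles`; the original analysis may have used a different definition), and the function names are hypothetical.

```python
import statistics


def drop_extreme_outliers(values, k=3.0):
    """Keep values within [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def interaction_score(replicates, min_replicates=4):
    """Median of the filtered replicates, or None if too few replicates remain."""
    kept = drop_extreme_outliers(replicates)
    if len(kept) < min_replicates:
        return None
    return statistics.median(kept)


if __name__ == "__main__":
    colony_sizes = [10, 11, 12, 13, 14, 15, 16, 100]
    print(drop_extreme_outliers(colony_sizes))
    print(interaction_score(colony_sizes))
```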

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations could be compared directly. The difference from the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
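The classification rules above can be summarized in a short sketch. Two assumptions are worth stating plainly: the text only says that p-values were FDR-corrected, so the Benjamini-Hochberg procedure is used here as one common choice, and the "unchanged" category is simplified to mean that no longer-linker combination passes both the difference and the p-value thresholds. The function names and input layout are hypothetical, and the paired t-test itself is not reproduced (p-values are taken as inputs).

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest to enforce monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(z_ref, longer, z_cut=2.5, diff_cut=1.0, alpha=0.05):
    """Classify one protein pair.

    z_ref:  z-score with the reference 2xL-2xL combination.
    longer: list of (z_score, difference_vs_reference, fdr_p_value),
            one entry per longer-linker combination.
    """
    if z_ref <= z_cut:
        detected_longer = any(z > z_cut for z, _, _ in longer)
        return "new" if detected_longer else "not detected"
    changed = any(abs(diff) > diff_cut and p < alpha for _, diff, p in longer)
    return "quantitative change" if changed else "unchanged"


if __name__ == "__main__":
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
    print(classify_pair(1.0, [(3.0, 2.0, 0.01)]))
    print(classify_pair(4.0, [(4.2, 0.2, 0.5)]))
    print(classify_pair(4.0, [(6.0, 1.5, 0.01)]))
```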

Analysis of protein distances within complexes

Yeast protein sequences of RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core, whereas the inner rings of the core were taken from 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is incomplete in the C-terminal section; in these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
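The surface-path distance described above can be sketched in a few lines. The study used the Dijkstra implementation in NetworkX; the version below is a self-contained stand-in built on Python's `heapq`, and the coordinates, node indices and function names are purely illustrative assumptions.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`; edge weight = distance.

    coords: list of (x, y, z) positions, e.g. C-alpha atoms of surface residues.
    """
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = d + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf


if __name__ == "__main__":
    # Four toy surface points (in angstroms); node 0 and node 3 stand in for two C-termini.
    points = [(0, 0, 0), (10, 0, 0), (20, 0, 0), (20, 10, 0)]
    surface_graph = build_graph(points, cutoff=15)
    print(shortest_path_length(surface_graph, 0, 3))
```

Because edges only join nearby surface nodes, the path length is never shorter than the straight-line distance between the two end points, which is the point of routing the measurement along the surface of the complex.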

All PPI data related to the global PCA and intra-complex experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, each with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here we detect some of these interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in size. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtering (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI).
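The distinction between protein pairs and unique pairs used above can be illustrated with a minimal sketch, assuming each detected interaction is recorded as a (bait, prey) tuple. The protein names `X`, `Y` and `Z` and the function name are hypothetical.

```python
def unique_pairs(directed_pairs):
    """Collapse reciprocal (bait, prey) and (prey, bait) entries into one unordered pair."""
    return {frozenset(pair) for pair in directed_pairs}


if __name__ == "__main__":
    detected = [("X", "Y"), ("Y", "X"), ("X", "Z")]
    print(len(detected), "pairs,", len(unique_pairs(detected)), "unique")
```

Using `frozenset` makes the pair order-independent, so X tagged with F[1,2] against Y tagged with F[3] and the reciprocal orientation count once.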

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). The PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of RNApol II, contributes to five of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, the PCA signal obtained with longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated by the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible in the cumulative distribution of z-scores as a function of pairwise distance, where positive z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are better detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.
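The negative relationship between pairwise distance and interaction z-score is summarized in the text by a rank correlation. As a minimal sketch of how such a coefficient can be computed, the Python below implements Spearman's correlation from scratch (average ranks for ties, then Pearson's correlation on the ranks). The function names and the six data points are hypothetical and serve only as an illustration; they are not values from this study, and the original analysis may have been carried out with different software.

```python
from math import sqrt

def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical distances between C-termini and PPI z-scores (illustration only).
distances = [20.0, 35.0, 50.0, 80.0, 120.0, 150.0]
z_scores = [9.1, 7.4, 7.9, 3.2, 1.0, 0.4]
rho = spearman(distances, z_scores)
print(f"Spearman r = {rho:.3f}")
```

A negative coefficient, as in this toy data set, indicates that the signal tends to decrease as the proteins get further apart.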

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information on the subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful for inferring the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linkers but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, counting reciprocal interactions, such as X-DHFR F[1,2] with Y-DHFR F[3] and Y-DHFR F[1,2] with X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.
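The counting conventions used in this legend (reciprocal bait-prey orientations counted as one unique PPI, and a PPI called new when it is absent from the 2xL-2xL reference but present with a longer linker) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are invented here, the protein names ProtA to ProtD are placeholders rather than results, and the "quantitatively changed" category is left out because it relies on a significance test that is not described in this section.

```python
def collapse_reciprocal(detected):
    """Group (bait, prey) observations by unordered protein pair.

    The bait carries DHFR F[1,2] and the prey DHFR F[3]; both orientations
    of the same two proteins map to the same frozenset key.
    """
    unique = {}
    for bait, prey in detected:
        unique.setdefault(frozenset((bait, prey)), set()).add((bait, prey))
    return unique

def classify(reference, longer):
    """Label each unique PPI seen with a longer linker combination.

    'new' means the pair was not detected with the reference linkers.
    """
    ref_pairs = set(collapse_reciprocal(reference))
    return {
        pair: ("previously detected" if pair in ref_pairs else "new")
        for pair in collapse_reciprocal(longer)
    }

# Placeholder observations (hypothetical, for illustration only).
reference_2xl = [("ProtA", "ProtB")]
longer_4xl = [
    ("ProtA", "ProtB"),
    ("ProtB", "ProtA"),  # reciprocal of the pair above
    ("ProtC", "ProtD"),
    ("ProtD", "ProtC"),  # reciprocal of the pair above
    ("ProtA", "ProtD"),
]

labels = classify(reference_2xl, longer_4xl)
n_observed = len(longer_4xl)   # 5 bait-prey observations
n_unique = len(labels)         # 3 unique PPIs
n_new = sum(1 for label in labels.values() if label == "new")
print(n_observed, n_unique, n_new)
```

Keeping the set of observed orientations for each pair also makes it possible to check afterwards whether a PPI was supported in both directions or in only one.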


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between the z-scores of all detected PPIs in the proteasome and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini: interactions that are not affected by longer linkers, those that increase in signal and those that are newly detected. p-values of Wilcoxon tests are shown.

Table S1A. Description of the strains constructed and used for this study.
Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment.
Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment.
Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study.
Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling.
Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference    Yeast protein    Complex    Identity (%)
4C2M chain 1    Rpc10    RNApol I    100
4C2M chain 2    Rpa34    RNApol I    92.4
4C2M chain 3    Rpa49    RNApol I    94.4
4C2M chain 4    Rpa43    RNApol I    100
4C2M chain 5    Rpa190    RNApol I    89.7
4C2M chain 6    Rpc40    RNApol I    100
4C2M chain 7    Rpa135    RNApol I    97.2
4C2M chain 8    Rpb5    RNApol I    100
4C2M chain 9    Rpa14    RNApol I    59.6
4C2M chain 10    Rpa43    RNApol I    81.4
4C2M chain 11    Rpo26    RNApol I    100
4C2M chain 12    Rpa12    RNApol I    100
4C2M chain 13    Rpb8    RNApol I    88.2
4C2M chain 14    Rpc19    RNApol I    100
4C2M chain 15    Rpb10    RNApol I    100
4C2M chain 16    Rpa49    RNApol I    100
4C2M chain 17    Rpc10    RNApol I    100
4C2M chain 18    Rpa43    RNApol I    100
4C2M chain 19    Rpa34    RNApol I    92.4
4C2M chain 20    Rpa135    RNApol I    96.2
4C2M chain 21    Rpa190    RNApol I    88.5
4C2M chain 22    Rpa14    RNApol I    55.1
4C2M chain 23    Rpc40    RNApol I    100
4C2M chain 24    Rpo26    RNApol I    100
4C2M chain 25    Rpb5    RNApol I    100
4C2M chain 26    Rpb8    RNApol I    88.2
4C2M chain 27    Rpa43    RNApol I    80.2
4C2M chain 28    Rpb10    RNApol I    100
4C2M chain 29    Rpa12    RNApol I    96
4C2M chain 30    Rpc19    RNApol I    100
4C3I chain A    Rpa190    RNApol I    89.2
4C3I chain C    Rpc40    RNApol I    99.3
4C3I chain B    Rpa135    RNApol I    98.2
4C3I chain E    Rpb5    RNApol I    100
4C3I chain D    Rpa14    RNApol I    55.1
4C3I chain G    Rpa43    RNApol I    78.3
4C3I chain F    Rpo26    RNApol I    100
4C3I chain I    Rpa12    RNApol I    100
4C3I chain H    Rpb8    RNApol I    84.7
4C3I chain K    Rpc19    RNApol I    100
4C3I chain J    Rpb10    RNApol I    100
4C3I chain M    Rpa49    RNApol I    97.2
4C3I chain L    Rpc10    RNApol I    100
4C3I chain N    Rpa34    RNApol I    88
4V1N chain A    Rpo21    RNApol II    97.9
4V1N chain C    Rpb3    RNApol II    100
4V1N chain B    Rpb2    RNApol II    93.6
4V1N chain E    Rpb5    RNApol II    100
4V1N chain D    Rpb4    RNApol II    80.8
4V1N chain G    Rpb7    RNApol II    100
4V1N chain F    Rpo26    RNApol II    100
4V1N chain I    Rpb9    RNApol II    100
4V1N chain H    Rpb8    RNApol II    91
4V1N chain K    Rpb11    RNApol II    100
4V1N chain J    Rpb10    RNApol II    100
4V1N chain L    Rpc10    RNApol II    100
4V1N chain R    Tfg2    RNApol II    60.3
5FJA chain A    Rpo31    RNApol III    96.2
5FJA chain C    Rpc40    RNApol III    100
5FJA chain B    Ret1    RNApol III    100
5FJA chain E    Rpb5    RNApol III    100
5FJA chain D    Rpc17    RNApol III    73.9
5FJA chain G    Rpc25    RNApol III    85.8
5FJA chain F    Rpo26    RNApol III    100
5FJA chain I    Rpc11    RNApol III    82.7
5FJA chain H    Rpb8    RNApol III    94.5
5FJA chain K    Rpc19    RNApol III    100
5FJA chain J    Rpb10    RNApol III    100
5FJA chain M    Rpc37    RNApol III    84.9
5FJA chain L    Rpc10    RNApol III    100
5FJA chain O    Rpc82    RNApol III    84.3
5FJA chain N    Rpc53    RNApol III    73.8
5FJA chain Q    Rpc31    RNApol III    100
5FJA chain P    Rpc34    RNApol III    57.2

Table S2C. Identity between the proteasome structures and the experimental sequences

Reference    Yeast protein    Complex    Identity (%)
5CZ4-centered chain A    Pre8    Proteasome    100
5CZ4-centered chain AA    Pre4    Proteasome    100
5CZ4-centered chain B    Pre9    Proteasome    100
5CZ4-centered chain BA    Pre3    Proteasome    100
5CZ4-centered chain C    Pre6    Proteasome    100
5CZ4-centered chain D    Pup2    Proteasome    97.1
5CZ4-centered chain E    Pre5    Proteasome    100
5CZ4-centered chain F    Pre10    Proteasome    100
5CZ4-centered chain G    Scl1    Proteasome    100
5CZ4-centered chain H    Pup1    Proteasome    100
5CZ4-centered chain I    Pup3    Proteasome    100
5CZ4-centered chain J    Pre1    Proteasome    100
5CZ4-centered chain K    Pre2    Proteasome    100
5CZ4-centered chain L    Pre7    Proteasome    100
5CZ4-centered chain M    Pre4    Proteasome    100
5CZ4-centered chain N    Pre3    Proteasome    100
5CZ4-centered chain O    Pre8    Proteasome    100
5CZ4-centered chain P    Pre9    Proteasome    100
5CZ4-centered chain Q    Pre6    Proteasome    100
5CZ4-centered chain R    Pup2    Proteasome    97.1
5CZ4-centered chain S    Pre5    Proteasome    100
5CZ4-centered chain T    Pre10    Proteasome    100
5CZ4-centered chain U    Scl1    Proteasome    100
5CZ4-centered chain V    Pup1    Proteasome    100
5CZ4-centered chain W    Pup3    Proteasome    100
5CZ4-centered chain X    Pre1    Proteasome    100
5CZ4-centered chain Y    Pre2    Proteasome    100
5CZ4-centered chain Z    Pre7    Proteasome    100
5A5B-centered chain A    Pre3    Proteasome    100
5A5B-centered chain AA    Rpn7    Proteasome    100
5A5B-centered chain B    Pup1    Proteasome    100
5A5B-centered chain BA    Rpn3    Proteasome    100
5A5B-centered chain C    Pup3    Proteasome    100
5A5B-centered chain CA    Rpn12    Proteasome    100
5A5B-centered chain D    Pre1    Proteasome    100
5A5B-centered chain DA    Rpn8    Proteasome    82.9
5A5B-centered chain E    Pre2    Proteasome    99.5
5A5B-centered chain EA    Rpn11    Proteasome    89.5
5A5B-centered chain F    Pre7    Proteasome    100
5A5B-centered chain FA    Rpn10    Proteasome    100
5A5B-centered chain G    Pre4    Proteasome    100
5A5B-centered chain GA    Rpn13    Proteasome    100
5A5B-centered chain HA    Sem1    Proteasome    100
5A5B-centered chain IA    Rpn1    Proteasome    85.9

5A5B-centered chain J    Scl1    Proteasome    100
5A5B-centered chain K    Pre8    Proteasome    100
5A5B-centered chain L    Pre9    Proteasome    100
5A5B-centered chain M    Pre6    Proteasome    100
5A5B-centered chain N    Pup2    Proteasome    100
5A5B-centered chain O    Pre5    Proteasome    100
5A5B-centered chain P    Pre10    Proteasome    100
5A5B-centered chain Q    Rpt1    Proteasome    88
5A5B-centered chain R    Rpt2    Proteasome    100
5A5B-centered chain S    Rpt6    Proteasome    100
5A5B-centered chain T    Rpt3    Proteasome    100
5A5B-centered chain U    Rpt4    Proteasome    100
5A5B-centered chain V    Rpt5    Proteasome    93.1
5A5B-centered chain W    Rpn2    Proteasome    90.9
5A5B-centered chain X    Rpn9    Proteasome    100
5A5B-centered chain Y    Rpn5    Proteasome    100
5A5B-centered chain Z    Rpn6    Proteasome    100
Constructed proteasome chain 1    Pup1    Proteasome    100
Constructed proteasome chain 10    Pre8    Proteasome    100
Constructed proteasome chain 11    Pre9    Proteasome    100
Constructed proteasome chain 12    Pre6    Proteasome    100
Constructed proteasome chain 13    Pup2    Proteasome    100
Constructed proteasome chain 14    Pre5    Proteasome    100
Constructed proteasome chain 15    Pre10    Proteasome    100
Constructed proteasome chain 16    Rpt1    Proteasome    88
Constructed proteasome chain 17    Rpt2    Proteasome    100
Constructed proteasome chain 18    Rpt6    Proteasome    100
Constructed proteasome chain 19    Rpt3    Proteasome    100
Constructed proteasome chain 2    Pup3    Proteasome    100
Constructed proteasome chain 20    Rpt4    Proteasome    100
Constructed proteasome chain 21    Rpt5    Proteasome    93.1
Constructed proteasome chain 22    Rpn2    Proteasome    90.9
Constructed proteasome chain 23    Rpn9    Proteasome    100
Constructed proteasome chain 24    Rpn5    Proteasome    100
Constructed proteasome chain 25    Rpn6    Proteasome    100
Constructed proteasome chain 26    Rpn7    Proteasome    100
Constructed proteasome chain 27    Rpn3    Proteasome    100
Constructed proteasome chain 28    Rpn12    Proteasome    100
Constructed proteasome chain 29    Rpn8    Proteasome    82.9
Constructed proteasome chain 3    Pre1    Proteasome    100
Constructed proteasome chain 30    Rpn11    Proteasome    89.5
Constructed proteasome chain 31    Rpn10    Proteasome    100
Constructed proteasome chain 32    Rpn13    Proteasome    100
Constructed proteasome chain 33    Sem1    Proteasome    100
Constructed proteasome chain 34    Rpn1    Proteasome    85.9
Constructed proteasome chain 35    Pup1    Proteasome    100
Constructed proteasome chain 36    Pup3    Proteasome    100
Constructed proteasome chain 37    Pre1    Proteasome    100
Constructed proteasome chain 38    Pre2    Proteasome    100

Constructed proteasome chain 39    Pre7    Proteasome    100
Constructed proteasome chain 4    Pre2    Proteasome    100
Constructed proteasome chain 40    Pre4    Proteasome    100
Constructed proteasome chain 41    Pre3    Proteasome    100
Constructed proteasome chain 42    Pre4    Proteasome    100
Constructed proteasome chain 45    Scl1    Proteasome    100
Constructed proteasome chain 46    Pre8    Proteasome    100
Constructed proteasome chain 47    Pre9    Proteasome    100
Constructed proteasome chain 48    Pre6    Proteasome    100
Constructed proteasome chain 49    Pup2    Proteasome    100
Constructed proteasome chain 5    Pre7    Proteasome    100
Constructed proteasome chain 50    Pre5    Proteasome    100
Constructed proteasome chain 51    Pre10    Proteasome    100
Constructed proteasome chain 52    Rpt1    Proteasome    88
Constructed proteasome chain 53    Rpt2    Proteasome    100
Constructed proteasome chain 54    Rpt6    Proteasome    100
Constructed proteasome chain 55    Rpt3    Proteasome    100
Constructed proteasome chain 56    Rpt4    Proteasome    100
Constructed proteasome chain 57    Rpt5    Proteasome    93.1
Constructed proteasome chain 58    Rpn2    Proteasome    90.9
Constructed proteasome chain 59    Rpn9    Proteasome    100
Constructed proteasome chain 6    Pre3    Proteasome    100
Constructed proteasome chain 60    Rpn5    Proteasome    100
Constructed proteasome chain 61    Rpn6    Proteasome    100
Constructed proteasome chain 62    Rpn7    Proteasome    100
Constructed proteasome chain 63    Rpn3    Proteasome    100
Constructed proteasome chain 64    Rpn12    Proteasome    100
Constructed proteasome chain 65    Rpn8    Proteasome    82.9
Constructed proteasome chain 66    Rpn11    Proteasome    89.5
Constructed proteasome chain 67    Rpn10    Proteasome    100
Constructed proteasome chain 68    Rpn13    Proteasome    100
Constructed proteasome chain 69    Sem1    Proteasome    100
Constructed proteasome chain 70    Rpn1    Proteasome    85.9
Constructed proteasome chain 9    Scl1    Proteasome    100

Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Yeast protein    Complex    Reference    Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. The Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right owing to the ability of the 4xL to detect interactions between proteins that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results of the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
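Panel (B) links a colony-size threshold to a z-score cut-off. A minimal sketch of that idea is given below, assuming that the z-score of a colony is its deviation from the mean of a background distribution, expressed in units of that distribution's standard deviation, and reading the cut-off in the legend as 2.5. How the background was actually defined is not stated in this section, so the `background` list, the colony sizes and the function name are hypothetical.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # cut-off read from the Figure S1 legend (assumed to be 2.5)

def z_scores(colony_sizes, background):
    """Express each colony size as a deviation from the background noise.

    Assumption: the background is summarized by its sample mean and
    sample standard deviation.
    """
    mu, sigma = mean(background), stdev(background)
    return [(size - mu) / sigma for size in colony_sizes]

# Hypothetical colony sizes in arbitrary units (illustration only).
background = [100, 110, 90, 105, 95]
sizes = [102, 150, 400]

scores = z_scores(sizes, background)
positive = [score > Z_THRESHOLD for score in scores]
for size, score, call in zip(sizes, scores, positive):
    print(f"size={size} z={score:.2f} positive={call}")
```

With these toy numbers, a colony barely larger than the background mean is not called, whereas the two larger colonies pass the cut-off.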


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
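A distance-weighted shortest path between two C-termini, routed through surface residues as in panels (C) and (D), can be sketched with Dijkstra's algorithm. The Python below is a generic illustration and not the pipeline used for this work: it assumes that surface residues are given as 3D coordinates, that two residues are connected when they lie within a `cutoff` distance, and that edges are weighted by Euclidean distance. The function name, the `cutoff` parameter and the three coordinates are hypothetical.

```python
import heapq
from math import dist, inf

def surface_path_length(points, start, goal, cutoff):
    """Length of the shortest path from points[start] to points[goal].

    Two points are linked when they are at most `cutoff` apart; each link
    is weighted by its Euclidean length. Returns inf if no path exists.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == goal:
            return d
        if d > best.get(u, inf):
            continue  # stale queue entry
        for v in range(len(points)):
            if v == u:
                continue
            w = dist(points[u], points[v])
            if w <= cutoff and d + w < best.get(v, inf):
                best[v] = d + w
                heapq.heappush(queue, (d + w, v))
    return inf

# Toy coordinates (hypothetical): two "C-termini" 6 units apart in a straight
# line, with one surface point between them that forces a detour.
points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 0.0, 0.0)]

print(surface_path_length(points, 0, 2, cutoff=5.0))   # path via the middle point
print(surface_path_length(points, 0, 2, cutoff=10.0))  # direct link allowed
```

Restricting the links to neighbouring surface points is what makes the path follow the surface of the complex instead of cutting straight through it, so the path length is never shorter than the straight-line distance.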


General conclusion

The aim of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close to each other in space without these associations necessarily being physical interactions. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results obtained with the distances measured from the available protein structures of the proteasome.

The results of the first validation show that by changing a single parameter, namely doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths that are used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment seems to be sufficient to detect most of the new PPIs and most of those whose signal increases. The rare cases in which the signal decreased with increasing linker length are more likely caused by steric effects than by destabilization of the proteins involved. However, these cases can still provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when it involves central proteins. Finally, the associations detected reflect well the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results obtained, it is possible to confirm that increasing linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to DHFR PCA represents a valuable advance in the study of protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2] fragment, it is possible to increase the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths, which would provide a validation and a point of comparison and would help to detect problems that may have occurred during construction of the proteins. For example, it is easier to spot a faulty recombination event or the appearance of mutations. Indeed, interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. However, adding this control certainly makes the experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA provides an additional hybrid method that remains relatively simple. It does not require any particular infrastructure, but can also be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously with each other unless they are very close together, which reduces false positives. DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would give several resolutions of the PPI network. It would first be necessary to determine the maximum length that allows plausible protein-protein associations to be detected while limiting false positives. It would also be necessary to determine the optimal increment to maximize new information while taking into account the additional complexity that comes with each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas the resolution should be lower for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or with other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they have notably been linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


adaptation would then yield a new hybrid method that could help define the protein-protein associations between protein complexes and subcomplexes. The first objective was to assess the general impact of different linker lengths on the ability to detect protein-protein associations. To this end, the protein-protein associations of 15 proteins found in seven protein complexes were tested against the proteins found in these complexes and their known interactors. The second objective was to assess the impact of increasing linker length on our understanding of the architecture of protein complexes and their subcomplexes. Five protein complexes that differ in size and flexibility were studied: the proteasome, RNApol I, II and III, and the conserved oligomeric Golgi (COG) complex. The study was carried out with different combinations of linker lengths. The last objective was to determine whether increasing linker length made it possible to detect associations between proteins that are further apart in space. To do so, distances were calculated between the proteins contained in the proteasome structures and compared with the experimental results.

This study was carried out using the eukaryotic model organism S. cerevisiae. Yeast is particularly attractive in several respects, notably the availability of numerous powerful genetic tools, its rapid cell division and the abundance of data on the structure of protein complexes and on PPIs. Moreover, this organism has played a major role in advancing knowledge in various fields such as the determination of protein function, regulatory networks, gene expression, protein interaction networks and the study of human diseases (70).


Measuring proximate protein association in living cells using

Protein-fragment complementation assay (PCA)

Summary

Understanding how the cellular system works requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural constraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool to detect and measure protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are further apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86).

The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary: the first tells us which proteins assemble into complexes in the cell, and the second tells us how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods that detect co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 35 times more likely to be detected if the C-termini were within less than 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
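To make the ruler idea concrete, the following Python sketch gives a rough upper bound on how far apart two C-termini could be while still being bridged by their linkers. It rests on a loud assumption that is not in the text: each linker residue contributes about 3.5 Å when the chain is fully extended. It also ignores the dimensions of the reporter fragments themselves, so it is not expected to reproduce the 82 Å figure quoted above.

```python
# Assumption (not from the source): contour length contributed by one
# residue of a fully extended peptide chain, in angstroms.
ANGSTROM_PER_RESIDUE = 3.5


def max_reach(linker_a_residues, linker_b_residues,
              per_residue=ANGSTROM_PER_RESIDUE):
    """Upper bound (angstroms) on the span of two fully extended linkers.

    Only the two linkers are counted; the reporter fragments are ignored.
    """
    return (linker_a_residues + linker_b_residues) * per_residue


# Linkers of 10, 15 and 20 residues on both partners.
reach_by_length = {n: max_reach(n, n) for n in (10, 15, 20)}
```

Real linkers are flexible and rarely fully extended, so these values are best read as ceilings on reach rather than as typical distances.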

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
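The use of synonymous codons can be sketched in Python: each repetition of Gly-Gly-Gly-Gly-Ser is encoded with a different choice among the standard codons for glycine and serine, so the DNA repeats differ while the peptide stays the same. The codon choices generated here are purely illustrative and are not the sequences used in the plasmids.

```python
# Standard genetic code entries for the two residues in the motif.
CODONS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
}
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}

MOTIF = "GGGGS"  # Gly-Gly-Gly-Gly-Ser, one-letter code


def encode_repeat(motif, offset):
    """Encode one motif copy, rotating through synonymous codons."""
    return "".join(
        CODONS[aa][(i + offset) % len(CODONS[aa])]
        for i, aa in enumerate(motif)
    )


def translate(dna):
    """Translate a Gly/Ser-only coding sequence back to one-letter code."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Four repeats (a 4xL-like linker), each with a distinct DNA sequence.
repeats = [encode_repeat(MOTIF, k) for k in range(4)]
linker_dna = "".join(repeats)
```

Varying the codons between repeats reduces the amount of identical DNA in tandem, which is one plausible reason to design the additions this way; the source does not state its motivation.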

To replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 with flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and confirmed by Sanger sequencing.
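An insert:vector ratio of this kind is a molar ratio, so the mass of insert depends on the relative lengths of the two DNA molecules. The Python sketch below shows the arithmetic, assuming double-stranded DNA whose mass is proportional to its length; the fragment sizes and vector amount in the example are hypothetical, not values from the protocol.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio):
    """Mass of insert (ng) giving the requested insert:vector molar ratio.

    Moles of dsDNA are proportional to mass / length, so the mass per
    base pair cancels out and only the length ratio matters.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    return vector_ng * insert_bp * molar_ratio / vector_bp


# Hypothetical example: 100 ng of a 5000 bp vector and a 500 bp insert
# combined at a 5:1 insert:vector molar ratio.
needed = insert_mass_ng(vector_ng=100, vector_bp=5000, insert_bp=500, molar_ratio=5)
```

For an assembly with two inserts, the same function can be called once per insert with its own length and ratio.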

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). Proper DHFR fragment fusions were confirmed for all strains by PCR and Sanger sequencing.

Estimation of protein abundance

Protein quantification was performed by Western blot for several strains expressing proteins fused with the 2xL and 4xL. These proteins were selected because their abundance could readily be assessed using antibodies raised against them. Exponentially growing cells (20 OD600) were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads (425-600 µm; Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p, anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.
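As an illustration of how band intensities from such a blot might be compared, the sketch below normalizes a target signal to the actin loading control in the same lane and then compares a 4xL strain with its 2xL counterpart. This normalization scheme is an assumption made for illustration; the text only states that actin served as a loading control. The intensity values are hypothetical.

```python
def normalized_abundance(target_signal, actin_signal):
    """Target band intensity divided by the actin band of the same lane."""
    if actin_signal <= 0:
        raise ValueError("loading-control signal must be positive")
    return target_signal / actin_signal


def fold_change(lane_4x, lane_2x):
    """Ratio of normalized abundance, 4xL strain over 2xL strain.

    Each lane is a (target_signal, actin_signal) tuple.
    """
    return normalized_abundance(*lane_4x) / normalized_abundance(*lane_2x)


# Hypothetical intensities in arbitrary units: (target, actin).
change = fold_change(lane_4x=(200.0, 100.0), lane_2x=(100.0, 100.0))
```

A value near 1 would indicate that lengthening the linker did not noticeably change the abundance of the fusion protein.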

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them according to data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published by (84) to choose 95 central and associated proteins that are members of the following complexes: the RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
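The block design described above can be sketched in a few lines. This is only an illustration of how the four linker combinations of one block could be enumerated; the function name and the strain labels are hypothetical and do not come from the original scripts.

```python
from itertools import product

# The two linker lengths used in the intra-complexes experiment.
LINKERS = ("2xL", "4xL")

def linker_block(bait, prey, linkers=LINKERS):
    """Return the four bait-prey strain pairs that make up one array block.

    The labels (e.g. "Rpb5-2xL") are hypothetical names used for illustration.
    """
    return [(f"{bait}-{b}", f"{prey}-{p}") for b, p in product(linkers, repeat=2)]

# Example with two RNApol subunits named in the text.
for bait_strain, prey_strain in linker_block("Rpb5", "Rpo26"):
    print(bait_strain, "x", prey_strain)
```

Keeping the four combinations in adjacent positions is what later allows their interaction scores to be compared directly within a block.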

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30°C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which differences in signal were increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. Correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function "Analyze particle" in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, using the particle closest to the center. We considered the particle as being a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
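For readers unfamiliar with the two-way median polish mentioned above, the following is a minimal sketch of Tukey's procedure in plain Python. It is not the custom ImageJ script used in this work, and how the polished matrix was combined with the raw image beyond averaging is not detailed here, so the sketch only shows the decomposition into an overall level, row effects, column effects and residuals. The function name and the toy matrix are assumptions made for illustration.

```python
from statistics import median

def median_polish(matrix, n_iter=10):
    """Tukey's two-way median polish.

    Returns (overall, row_effects, col_effects, residuals) such that
    matrix[i][j] == overall + row_effects[i] + col_effects[j] + residuals[i][j].
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    resid = [[float(v) for v in row] for row in matrix]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep the row medians out of the residuals.
        for i in range(n_rows):
            m = median(resid[i])
            resid[i] = [v - m for v in resid[i]]
            row_eff[i] += m
        # Move the median of the column effects into the overall level.
        m = median(col_eff)
        col_eff = [v - m for v in col_eff]
        overall += m
        # Sweep the column medians out of the residuals.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            for i in range(n_rows):
                resid[i][j] -= m
            col_eff[j] += m
        # Move the median of the row effects into the overall level.
        m = median(row_eff)
        row_eff = [v - m for v in row_eff]
        overall += m
    return overall, row_eff, col_eff, resid

# Toy, purely additive intensity matrix (not real plate data).
overall, row_eff, col_eff, resid = median_polish([[0, 10, 20], [1, 11, 21], [2, 12, 22]])
print(overall, row_eff, col_eff)
```

On a purely additive matrix such as this toy example the residuals vanish, which is a convenient sanity check before applying the procedure to pixel intensities.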

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
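The scoring step can be illustrated with a short sketch. The analysis itself relied on the R package mixtools; the code below is a plain-Python stand-in that fits a two-component normal mixture by expectation-maximization and converts scores to z-scores against the lower (background) component. The function names, the initialization from quantiles and the synthetic scores are assumptions made for illustration, not the original implementation.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def fit_two_normals(scores, n_iter=100):
    """Fit a mixture of two normals by EM.

    Returns (background, signal), each as [weight, mean, sd]; the background
    is the component with the lower mean.
    """
    xs = sorted(scores)
    n = len(xs)
    grand_mean = sum(xs) / n
    grand_sd = math.sqrt(sum((x - grand_mean) ** 2 for x in xs) / n) or 1.0
    # Assumed initialization: means at the 25th and 95th percentiles.
    comps = [[0.5, xs[n // 4], grand_sd], [0.5, xs[(19 * n) // 20], grand_sd]]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, sd) for w, m, sd in comps]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weight, mean and standard deviation.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            mean = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, xs)) / nk
            comps[k] = [nk / n, mean, max(math.sqrt(var), 1e-6)]
    background, signal = sorted(comps, key=lambda c: c[1])
    return background, signal

def z_score(interaction_score, bg_mean, bg_sd):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background component."""
    return (interaction_score - bg_mean) / bg_sd

# Synthetic interaction scores (not data from this study).
rng = random.Random(0)
demo_scores = [rng.gauss(5.0, 0.5) for _ in range(400)]
demo_scores += [rng.gauss(10.0, 1.0) for _ in range(60)]
background, signal = fit_two_normals(demo_scores)
detected = [s for s in demo_scores if z_score(s, background[1], background[2]) > 2.5]
print(round(background[1], 2), round(background[2], 2), len(detected))
```

Using the background component rather than the whole distribution keeps the z-scores from being deflated by the true interactions themselves.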

For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. colonies with values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
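The extreme-outlier rule can be written compactly. The sketch below assumes the quartile definition of Python's statistics.quantiles (default "exclusive" method), which may differ slightly from the one used in the original analysis; the function name and the example values are hypothetical.

```python
from statistics import quantiles

def remove_extreme_outliers(colony_sizes, k=3.0):
    """Drop values below Q1 - k*(Q3 - Q1) or above Q3 + k*(Q3 - Q1)."""
    q1, _, q3 = quantiles(colony_sizes, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in colony_sizes if low <= v <= high]

# Toy replicate values with one aberrant colony (not real data).
print(remove_extreme_outliers([10, 11, 12, 11, 10, 12, 11, 100]))
```

With k = 3 only very distant values are removed, so ordinary replicate-to-replicate variation is left untouched.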

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1 with t-test FDR p-value < 0.05) (Table S1C).
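The classification rules above can be summarized in a small sketch. It assumes that the FDR correction is the Benjamini-Hochberg procedure, which the text does not specify, and it takes the paired t-test p-values as given rather than computing them. Pairs with a significant but small difference (absolute value of 1 or less) are treated here as unchanged, which is also an assumption. Function names and example values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify_pair(detected_ref, detected_longer, differences, q_values,
                  min_diff=1.0, alpha=0.05):
    """Classify one protein pair from its linker-combination results.

    detected_ref: detected with 2xL-2xL; detected_longer: detected with at
    least one longer combination; differences and q_values: one entry per
    longer combination (2xL-4xL, 4xL-2xL, 4xL-4xL) versus the reference.
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    changed = any(abs(d) > min_diff and q < alpha
                  for d, q in zip(differences, q_values))
    return "quantitative change" if changed else "unchanged"

# Toy values, not taken from Table S1C.
q = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
print([round(v, 4) for v in q])
print(classify_pair(True, True, [0.2, 1.5, 0.3], [0.5, 0.01, 0.2]))
```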

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered as surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
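The distance calculation can be illustrated with a self-contained sketch. The analysis used the Dijkstra implementation of NetworkX; the version below uses only the Python standard library (heapq) so that it runs without dependencies. The function names, the three toy coordinates and the choice of cutoff in the example are assumptions made for illustration.

```python
import heapq
import math

def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than the cutoff; weight = distance.

    coords maps a node name (e.g. a surface residue) to its (x, y, z) position.
    """
    graph = {name: [] for name in coords}
    names = list(coords)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph

def shortest_path_length(graph, source, target):
    """Weighted shortest-path length (Dijkstra); math.inf if unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf

# Three toy surface points (in Å); A and C are too far apart for a direct edge.
toy_coords = {"A": (0.0, 0.0, 0.0), "B": (12.0, 0.0, 0.0), "C": (12.0, 12.0, 0.0)}
toy_graph = build_graph(toy_coords, cutoff=15.0)
print(shortest_path_length(toy_graph, "A", "C"))
```

Because edges only connect nearby surface points, the path is forced to follow the surface of the complex instead of cutting through it, which is why this measure is longer than the straight-line distance in the toy example.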

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in size. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI).
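The distinction between all detected pairs and unique pairs can be made concrete with a short sketch: a reciprocal pair tested in both orientations (X as bait and Y as prey, then Y as bait and X as prey) is counted once. The function name and the example pairs are hypothetical and serve only to illustrate the counting.

```python
def unique_pairs(directed_pairs):
    """Collapse (bait, prey) pairs so that reciprocal tests count once."""
    return {frozenset(pair) for pair in directed_pairs}

# Toy (bait, prey) detections; the first two are reciprocal.
toy_pairs = [("Rpb5", "Rpo26"), ("Rpo26", "Rpb5"), ("Rpb5", "Rpb10")]
print(len(toy_pairs), "pairs,", len(unique_pairs(toy_pairs)), "unique")
```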

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and interaction z-score of PPIs for all lengths of linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linker sizes (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where longer linker size increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that longer linker size enhances the ability to detect interactions, especially for proteins that are more distant in space.
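The correlation between pairwise distance and z-score reported for the proteasome is a Spearman rank correlation. As an illustration of what this statistic measures, the sketch below computes it in plain Python as the Pearson correlation of average ranks; it is not the code used for the analysis, and the function names and toy values are assumptions.

```python
import math

def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy distances (Å) and z-scores, strictly decreasing with distance.
toy_distances = [10.0, 20.0, 30.0, 40.0]
toy_z_scores = [8.0, 5.0, 3.0, 1.0]
print(spearman(toy_distances, toy_z_scores))
```

A rank-based statistic is a natural choice here because the relationship between distance and PCA signal is expected to be monotonic but not necessarily linear.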

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys while requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to detect low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institute of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canadian Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs over the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request

Table S1B PCA data for global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request

Table S1C PCA data for intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request

Table S1D PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request


Table S2A Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request


Table S2B Identity between each RNApol structures and the experimental sequences

Reference        Yeast protein   Complex      Identity (%)
4C2M chain 1     Rpc10           RNApol I     100
4C2M chain 2     Rpa34           RNApol I     92.4
4C2M chain 3     Rpa49           RNApol I     94.4
4C2M chain 4     Rpa43           RNApol I     100
4C2M chain 5     Rpa190          RNApol I     89.7
4C2M chain 6     Rpc40           RNApol I     100
4C2M chain 7     Rpa135          RNApol I     97.2
4C2M chain 8     Rpb5            RNApol I     100
4C2M chain 9     Rpa14           RNApol I     59.6
4C2M chain 10    Rpa43           RNApol I     81.4
4C2M chain 11    Rpo26           RNApol I     100
4C2M chain 12    Rpa12           RNApol I     100
4C2M chain 13    Rpb8            RNApol I     88.2
4C2M chain 14    Rpc19           RNApol I     100
4C2M chain 15    Rpb10           RNApol I     100
4C2M chain 16    Rpa49           RNApol I     100
4C2M chain 17    Rpc10           RNApol I     100
4C2M chain 18    Rpa43           RNApol I     100
4C2M chain 19    Rpa34           RNApol I     92.4
4C2M chain 20    Rpa135          RNApol I     96.2
4C2M chain 21    Rpa190          RNApol I     88.5
4C2M chain 22    Rpa14           RNApol I     55.1
4C2M chain 23    Rpc40           RNApol I     100
4C2M chain 24    Rpo26           RNApol I     100
4C2M chain 25    Rpb5            RNApol I     100
4C2M chain 26    Rpb8            RNApol I     88.2
4C2M chain 27    Rpa43           RNApol I     80.2
4C2M chain 28    Rpb10           RNApol I     100
4C2M chain 29    Rpa12           RNApol I     96
4C2M chain 30    Rpc19           RNApol I     100
4C3I chain A     Rpa190          RNApol I     89.2
4C3I chain C     Rpc40           RNApol I     99.3
4C3I chain B     Rpa135          RNApol I     98.2
4C3I chain E     Rpb5            RNApol I     100
4C3I chain D     Rpa14           RNApol I     55.1
4C3I chain G     Rpa43           RNApol I     78.3
4C3I chain F     Rpo26           RNApol I     100
4C3I chain I     Rpa12           RNApol I     100
4C3I chain H     Rpb8            RNApol I     84.7
4C3I chain K     Rpc19           RNApol I     100
4C3I chain J     Rpb10           RNApol I     100
4C3I chain M     Rpa49           RNApol I     97.2
4C3I chain L     Rpc10           RNApol I     100
4C3I chain N     Rpa34           RNApol I     88
4V1N chain A     Rpo21           RNApol II    97.9
4V1N chain C     Rpb3            RNApol II    100
4V1N chain B     Rpb2            RNApol II    93.6
4V1N chain E     Rpb5            RNApol II    100
4V1N chain D     Rpb4            RNApol II    80.8
4V1N chain G     Rpb7            RNApol II    100
4V1N chain F     Rpo26           RNApol II    100
4V1N chain I     Rpb9            RNApol II    100
4V1N chain H     Rpb8            RNApol II    91
4V1N chain K     Rpb11           RNApol II    100
4V1N chain J     Rpb10           RNApol II    100
4V1N chain L     Rpc10           RNApol II    100
4V1N chain R     Tfg2            RNApol II    60.3
5FJA chain A     Rpo31           RNApol III   96.2
5FJA chain C     Rpc40           RNApol III   100
5FJA chain B     Ret1            RNApol III   100
5FJA chain E     Rpb5            RNApol III   100
5FJA chain D     Rpc17           RNApol III   73.9
5FJA chain G     Rpc25           RNApol III   85.8
5FJA chain F     Rpo26           RNApol III   100
5FJA chain I     Rpc11           RNApol III   82.7
5FJA chain H     Rpb8            RNApol III   94.5
5FJA chain K     Rpc19           RNApol III   100
5FJA chain J     Rpb10           RNApol III   100
5FJA chain M     Rpc37           RNApol III   84.9
5FJA chain L     Rpc10           RNApol III   100
5FJA chain O     Rpc82           RNApol III   84.3
5FJA chain N     Rpc53           RNApol III   73.8
5FJA chain Q     Rpc31           RNApol III   100
5FJA chain P     Rpc34           RNApol III   57.2

34

Table S2C Identity between proteasome structure and the experimental sequence

Reference Yeast proteins Complex Identity (%)

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 97.1

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 97.1

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 82.9

5A5B-centered chain E Pre2 Proteasome 99.5

5A5B-centered chain EA Rpn11 Proteasome 89.5

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 85.9


5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 93.1

5A5B-centered chain W Rpn2 Proteasome 90.9

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 93.1

Constructed proteasome chain 22 Rpn2 Proteasome 90.9

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 82.9

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 89.5

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 85.9

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100


Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100
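
As a rough illustration of how an identity value such as those in Table S2C can be computed, the sketch below calculates percent identity over a pairwise alignment. The function name, the gap character and the choice of denominator (alignment columns where the structure chain has a residue) are assumptions made for this example; this section does not state which alignment tool or exact definition was used.

```python
def percent_identity(aligned_structure: str, aligned_experimental: str, gap: str = "-") -> float:
    """Percent of structure residues identical to the experimental sequence.

    Both inputs are rows of one pairwise alignment and must have equal length.
    Columns where the structure chain has a gap are ignored (assumed convention).
    """
    if len(aligned_structure) != len(aligned_experimental):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for s, e in zip(aligned_structure, aligned_experimental):
        if s == gap:
            continue  # residue absent from the structure: not counted
        compared += 1
        if s == e:
            matches += 1
    if compared == 0:
        raise ValueError("no residues to compare")
    return 100.0 * matches / compared


# Toy sequences, not taken from any PDB entry: 4 matches over 5 compared columns.
print(round(percent_identity("MKT-LV", "MKTALI"), 1))  # 80.0
```

With this convention, residues missing from the structure do not lower the identity; they are counted separately, as in Table S2D.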


Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Yeast proteins  Complex  Reference  Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7
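
The quantity reported in Table S2D can be pictured with the short sketch below. It assumes that the count is simply the full protein length minus the position of the last residue modeled in the structure, with both numbered on the full-length sequence; the function name and the example numbers are hypothetical and are not taken from the table.

```python
def missing_c_terminal_residues(full_length: int, last_resolved: int) -> int:
    """Number of C-terminal residues absent from a structure.

    full_length: length of the full protein sequence.
    last_resolved: position (1-based) of the last residue modeled in the
    structure, using the same numbering as the full sequence.
    """
    if not 0 <= last_resolved <= full_length:
        raise ValueError("last_resolved must lie between 0 and full_length")
    return full_length - last_resolved


# Hypothetical protein of 250 residues whose structure stops at residue 240.
print(missing_c_terminal_residues(250, 240))  # 10
```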


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect interactions farther apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
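
To make the thresholding in panel (B) concrete, here is a minimal sketch of calling positive PPIs from colony sizes with a z-score cutoff of 2.5. The function name, the pair labels and the toy data are hypothetical, and the z-score is computed here against the mean and population standard deviation of all values supplied, which is an assumption; the reference distribution actually used for the figure is not described in this caption.

```python
import statistics


def positive_ppis(colony_sizes, threshold=2.5):
    """Return the pairs whose colony-size z-score exceeds the threshold.

    colony_sizes maps a pair label to a colony size. The z-score is taken
    against the mean and population standard deviation of all values passed
    in, which is an assumption of this sketch.
    """
    values = list(colony_sizes.values())
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return []  # no spread, so nothing stands out
    return [pair for pair, size in colony_sizes.items()
            if (size - mean) / sd > threshold]


# Toy colony sizes: nine background pairs and one strong signal (z = 3.0).
sizes = {f"bait{i}-prey{i}": 100 for i in range(9)}
sizes["baitX-preyY"] = 1000
print(positive_ppis(sizes))  # ['baitX-preyY']
```

In a real screen the background would normally be estimated from many more colonies than in this toy set, since the largest z-score attainable in a sample is bounded by its size.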


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres/dots: surface. Green spheres: surface residues on the proteasome.
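
The idea of a distance-weighted shortest path between two C-termini, as in panel (C), can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used for the figure: the neighbor graph built with a fixed distance cutoff, the function name, the residue labels and the coordinates are all hypothetical. Dijkstra's algorithm is used to find the shortest path, with each edge weighted by the Euclidean distance between two surface points.

```python
import heapq
import math


def surface_path_length(coords, source, target, cutoff):
    """Length of the shortest path from source to target on a neighbor graph.

    coords maps a residue label to its (x, y, z) position. Two residues are
    connected when they lie within `cutoff` of each other, and each edge is
    weighted by the Euclidean distance between them (assumed construction).
    Returns math.inf when the target cannot be reached.
    """
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for other, pos in coords.items():
            if other == node:
                continue
            step = math.dist(coords[node], pos)
            if step <= cutoff and dist + step < best.get(other, math.inf):
                best[other] = dist + step
                heapq.heappush(queue, (dist + step, other))
    return math.inf


# Hypothetical coordinates: the two C-termini are 6 units apart in a straight
# line, but with a cutoff of 5.5 the path must pass through surface point s1.
coords = {
    "Cter_A": (0.0, 0.0, 0.0),
    "s1": (3.0, 4.0, 0.0),
    "Cter_B": (6.0, 0.0, 0.0),
}
print(surface_path_length(coords, "Cter_A", "Cter_B", cutoff=5.5))  # 10.0
```

Restricting the path to surface points gives a distance that goes around the complex instead of through it, which is closer to the route a flexible linker would have to follow.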


General conclusion

The goal of this project was to develop a relatively simple hybrid method. Here, "hybrid method" refers to a method that detects associations between proteins that are close to one another in space without these associations necessarily being physical interactions. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results obtained with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variant of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain roughly as many new interactions if this variant of the PCA were applied to protein complexes that have already been studied. This percentage could vary depending on the number of linker-length combinations used. For example, it could be lower if only a single combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and those whose signal increases.


The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. Nevertheless, these cases can still provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when it involves the central proteins. Finally, the detected associations accurately reflect the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results obtained, it is possible to confirm that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA represents a significant advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the linkers of additional lengths, which would provide validation and a point of comparison and would help detect problems that may have arisen during construction of the proteins. For example, it is easier to spot a faulty recombination event or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Admittedly, adding this control makes the experiments and analyses more complex. Despite this drawback, this variant of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that preserves the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are brought very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would provide several resolutions of the PPI network. It would first be necessary to determine the maximum length that allows plausible protein-protein associations to be detected while limiting false positives. It would also be necessary to determine the optimal increment to maximize new information while taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics : TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics : MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics : MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments : JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews : MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

50

84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709


Measuring proximate protein association in living cells using Protein-fragment Complementation Assay (PCA)

Summary

Understanding how the cellular system functions requires cataloguing how proteins assemble with one another into complexes and determining their spatial arrangements. We examined the potential of the protein-fragment complementation assay based on dihydrofolate reductase (DHFR PCA) in yeast to obtain low-resolution structural restraints on protein complexes. We showed that using extended peptide linkers between the fusion proteins and the DHFR fragments improves the detection of protein-protein interactions and reveals interactions that are more distant in space. Extended linkers thus provide an improved tool for detecting and measuring protein-protein interactions and protein proximity in vivo. We used this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complex in yeast. Our results offer new avenues for dissecting protein networks in vivo.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast Protein-fragment Complementation Assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are further apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex membership or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, the assignment of a protein to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership because detecting an association does not provide information on the spatial organization of the complex (84-86).

The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories of methods are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future developments in cell and systems biology because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods that detect co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were within 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to linkers of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons encoding the same peptide sequence.
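To make the linker nomenclature concrete, the peptide sequences implied by the description above can be written out as follows. This is only an illustrative sketch: the helper name `linker_peptide` is hypothetical, and the nucleotide sequences (which use synonymous codons) are not reproduced here.

```python
# One repeat of the linker motif Gly-Gly-Gly-Gly-Ser in one-letter code.
MOTIF = "GGGGS"

def linker_peptide(n_repeats: int) -> str:
    """Return the peptide sequence of a linker made of n_repeats motifs."""
    return MOTIF * n_repeats

# 2xL is the standard linker; 3xL and 4xL add one and two repeats.
for name, n in (("2xL", 2), ("3xL", 3), ("4xL", 4)):
    peptide = linker_peptide(n)
    print(f"{name}: {peptide} ({len(peptide)} amino acids)")
```

The 2xL is thus ten residues long, while the 3xL and 4xL reach 15 and 20 residues.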

To replace the 2xL of pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 with flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR-amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were made at the 3′ end of genes. The 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are listed in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). Proper DHFR fragment fusions were confirmed by PCR and Sanger sequencing for all strains.

Estimation of protein abundance

Protein abundance was quantified by Western blot for several strains with proteins fused to the 2xL and 4xL. These proteins were selected because we could readily assess their abundance using antibodies raised against them. A total of 20 OD600 of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL pepstatin A, 0.5 µg/mL leupeptin and 2 µg/mL aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10000), IRDye® 680RD donkey anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published by Benschop et al. (84) to choose 95 central and associated protein members of the following complexes: the RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 out of the 95 proteins initially selected are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]/Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
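The four-strain block design described above amounts to enumerating every ordered pairing of the two linker sizes. A minimal sketch, in which the variable names are illustrative and the order of the two labels within each pair simply follows the listing in the text:

```python
from itertools import product

# Linker sizes tested in the intra-complexes experiment.
LINKERS = ("2xL", "4xL")

# Every ordered pairing of linker sizes within one block of four strains.
block = [f"{first}-{second}" for first, second in product(LINKERS, repeat=2)]
print(block)
```

Placing these four combinations next to each other on the array is what later allows their interaction scores to be compared directly within a block.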

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed into 384-format arrays and finally into 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates in 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates using a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA the results for 25 PPIs whose signal increased, did not change or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments is shown in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted, and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files, and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, using the particle closest to the center. We considered the particle to be a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction across linker size combinations. We considered changes significant when Zs differed by more than 2.
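The scoring scheme described above can be sketched in a few lines. The analysis itself used the R function normalmixEM from mixtools; the pure-Python EM below is only a stand-in written for illustration, and the function names (`fit_two_normals`, `z_score`), the initialization and the synthetic data are all assumptions rather than the original pipeline.

```python
import math
import random

def normal_pdf(x, mu, sd):
    """Density of the normal distribution N(mu, sd^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def fit_two_normals(scores, n_iter=100):
    """Fit a two-component 1-D normal mixture by EM.

    Returns [(mean, sd, weight), ...] sorted by mean, so the first entry
    is the lower (background) component. Assumes the scores are not all equal.
    """
    n = len(scores)
    overall_mean = sum(scores) / n
    overall_sd = math.sqrt(sum((x - overall_mean) ** 2 for x in scores) / n)
    mu = [min(scores), max(scores)]   # start the two components far apart
    sd = [overall_sd, overall_sd]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: probability that each score belongs to component 0.
        resp0 = []
        for x in scores:
            p0 = w[0] * normal_pdf(x, mu[0], sd[0])
            p1 = w[1] * normal_pdf(x, mu[1], sd[1])
            total = p0 + p1
            resp0.append(p0 / total if total > 0.0 else 0.5)
        # M-step: update mean, sd and weight of each component.
        for k in (0, 1):
            r = resp0 if k == 0 else [1.0 - v for v in resp0]
            n_k = max(sum(r), 1e-12)
            mu[k] = sum(ri * x for ri, x in zip(r, scores)) / n_k
            var = sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, scores)) / n_k
            sd[k] = max(math.sqrt(var), 1e-6)
            w[k] = n_k / n
    return sorted(zip(mu, sd, w))

def z_score(interaction_score, b, sdb):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background component."""
    return (interaction_score - b) / sdb

# Synthetic interaction scores: a large background plus a smaller signal population.
random.seed(0)
scores = [random.gauss(8.0, 0.5) for _ in range(300)]
scores += [random.gauss(13.0, 1.0) for _ in range(60)]

(b, sdb, _weight), _signal = fit_two_normals(scores)
detected = [s for s in scores if z_score(s, b, sdb) > 2.5]
print(f"background mean={b:.2f}, sd={sdb:.2f}, detected={len(detected)}")
```

Using the background component alone to define the z-score means that the detection threshold is set by the noise distribution, not by how many true interactions happen to be on the plate.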

For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) gave inconsistent results for all tested interactions, leaving a total of 228 (197 unique) pairs of interacting proteins.
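The extreme-outlier rule used for the intra-complexes experiment can be illustrated as follows. This is a sketch only: the function name `drop_extreme_outliers` and the example values are hypothetical, and the quartiles are computed with the default method of Python's `statistics.quantiles`, which may differ slightly from the quartile definition used in the original analysis.

```python
import statistics

def drop_extreme_outliers(values, k=3.0):
    """Keep values within [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lower <= v <= upper]

# Hypothetical replicate colony sizes for one interaction, with one aberrant colony.
replicates = [10, 11, 12, 11, 10, 12, 11, 50]
filtered = drop_extreme_outliers(replicates)
interaction_score = statistics.median(filtered)
print(filtered, interaction_score)
```

With a multiplier of 3 on the interquartile range, only grossly aberrant colonies are discarded, and the median of the remaining replicates serves as the interaction score.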

At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference from the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than −1 with t-test FDR p-value < 0.05) (Table S1C).
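The classification rules above can be summarized in a short sketch. Two points are assumptions made for illustration: the text does not name the FDR procedure, so the Benjamini-Hochberg adjustment is used here, and pairs with a significant p-value but an absolute difference of 1 or less are treated as unchanged. The paired t-test itself is not reimplemented; its p-values are taken as inputs, and all function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def classify(detected_2x2x, detected_longer, difference, fdr_p):
    """Assign a protein pair to one of the categories described in the text."""
    if not detected_2x2x:
        return "new" if detected_longer else "not detected"
    if abs(difference) > 1 and fdr_p < 0.05:
        return "quantitative change"
    return "unchanged"

# Hypothetical raw p-values from paired t-tests on four protein pairs.
raw_p = [0.01, 0.04, 0.03, 0.5]
fdr_p = benjamini_hochberg(raw_p)
print(fdr_p)
print(classify(True, True, 1.5, fdr_p[0]))
```

In practice, a pair would be tested against each longer linker combination in turn and labeled as a quantitative change if at least one of them meets both criteria.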

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core, and of 5CZ4 for the inner rings of the core. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using Dijkstra's algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
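The surface-path distance described above can be illustrated with a small self-contained sketch. The original analysis used the Dijkstra implementation in NetworkX; the version below substitutes a plain `heapq`-based Dijkstra so that it runs with the standard library only. The node names, coordinates and function names are made up for illustration and do not come from any of the structures analyzed.

```python
import heapq
import math

def build_surface_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff` (in Å).

    `coords` maps a node label (e.g. a surface Cα) to an (x, y, z) tuple.
    Edge weights are Euclidean distances.
    """
    nodes = list(coords)
    graph = {node: [] for node in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph

def shortest_path_length(graph, source, target):
    """Length of the weighted shortest path (Dijkstra); inf if unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            candidate = d + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf

# Toy surface: four Cα positions (Å); A and D stand for two C-terminal residues.
toy_coords = {
    "A": (0.0, 0.0, 0.0),
    "B": (10.0, 0.0, 0.0),
    "C": (20.0, 0.0, 0.0),
    "D": (20.0, 10.0, 0.0),
}
graph = build_surface_graph(toy_coords, cutoff=15.0)
path_length = shortest_path_length(graph, "A", "D")
straight_line = math.dist(toy_coords["A"], toy_coords["D"])
print(f"surface path: {path_length:.1f} Å, straight line: {straight_line:.1f} Å")
```

Because the path is forced to follow edges between nearby surface residues, it is never shorter than the straight-line distance between the two termini.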

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, the effect is minor and not general.

To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
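The size of the screen and the z-score rule used to call PPIs can be sketched in Python as follows. This is an illustration only: the function name, the toy input and the way the background is estimated (mean and standard deviation of all colony sizes supplied) are assumptions, not the published analysis pipeline, and the threshold is left as a parameter.

```python
from statistics import mean, pstdev

N_BAIT_PROTEINS = 15   # proteins tagged as baits
N_LINKERS = 3          # 2xL, 3xL and 4xL
N_PREYS = 592          # prey strains fused to DHFR F[3]

n_baits = N_BAIT_PROTEINS * N_LINKERS   # 45 bait strains
n_tested = n_baits * N_PREYS            # 26640 bait-prey combinations
print(n_baits, n_tested)                # 45 26640

def call_ppis(colony_sizes, threshold):
    """Return {pair: z-score} for the pairs whose z-score exceeds `threshold`.

    colony_sizes: dict mapping a (bait, prey) pair to a colony size.
    The z-score is computed against the mean and standard deviation of all
    supplied colony sizes, an assumed stand-in for the background model.
    """
    values = list(colony_sizes.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {}
    z_scores = {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}
    return {pair: z for pair, z in z_scores.items() if z > threshold}
```

In a real screen the background distribution would typically be dominated by non-interacting pairs, which is what makes a single threshold on the z-score meaningful.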

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] accounting for only one PPI) after filtration.
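Counting reciprocal bait-prey orientations as a single PPI amounts to treating each interaction as an unordered pair. A minimal Python sketch, with a hypothetical function name and toy protein labels:

```python
def unique_pairs(detected):
    """Collapse reciprocal bait-prey orientations into unordered protein pairs.

    detected: iterable of (bait, prey) tuples, where the bait carries one
    DHFR fragment and the prey carries the complementary one.
    """
    return {frozenset(pair) for pair in detected}

# X-Y and Y-X are the same unordered pair, so three detections give two unique PPIs.
detected = [("X", "Y"), ("Y", "X"), ("X", "Z")]
print(len(detected), len(unique_pairs(detected)))  # 3 2
```

Using `frozenset` rather than a sorted tuple also handles a protein tested against itself, which collapses to a one-element set.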

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in the interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the fact that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.
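The counts reported in this paragraph can be cross-checked with a few lines of Python. Only numbers quoted in the text are used; the dictionary layout and variable names are choices made for this illustration.

```python
# Interacting pairs by category, as quoted in the text.
counts = {
    "all": {"unchanged": 135, "changed": 19, "new": 74},
    "unique": {"unchanged": 114, "changed": 18, "new": 65},
}

for label, c in counts.items():
    total = sum(c.values())                  # all interacting pairs
    affected = c["changed"] + c["new"]       # pairs affected by the longer linker
    pct_unchanged = round(100 * c["unchanged"] / total, 1)
    print(label, total, affected, pct_unchanged)
# all 228 93 59.2
# unique 197 83 57.9
```

Both percentages are consistent with the "almost 60%" of unchanged pairs stated above.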

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had an opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for RNApol complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the individual polymerases (Fig 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D, center panel). In the COG complex, new interactions were seen between Cog1, from the core subunit, and proteins from lobe A or lobe B (Fig 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and interaction z-score of PPIs for all lengths of linkers (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are better detected with longer linker sizes (Fig S1D). Finally, we find that distance among proteins is significantly longer for cases where longer linker size increases signal or leads to the detection of new interactions (Fig 2C). This demonstrates once again that longer linker size enhances the ability to detect interactions, especially for proteins that are more distant in space.
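The negative relationship between C-terminal distance and z-score is summarized with a Spearman rank correlation. As a reminder of what that statistic computes, here is a dependency-free Python sketch; the function names are ours and the toy values are invented for the example, not taken from Table S2A.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Toy values (not data from the study): z-scores tend to fall as distance grows.
distances = [20.0, 45.0, 60.0, 90.0, 120.0]
z_scores = [8.1, 6.5, 7.0, 3.2, 2.9]
print(round(spearman(distances, z_scores), 2))  # -0.9
```

Because the statistic depends only on ranks, it captures a monotonic decrease of signal with distance without assuming that the decrease is linear.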

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys while requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to detect low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institute of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canadian Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering each pair of reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A. Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Columns: Reference, Yeast protein, Complex, Identity (%)

4C2M chain 1 Rpc10 RNApol I 100
4C2M chain 2 Rpa34 RNApol I 92.4
4C2M chain 3 Rpa49 RNApol I 94.4
4C2M chain 4 Rpa43 RNApol I 100
4C2M chain 5 Rpa190 RNApol I 89.7
4C2M chain 6 Rpc40 RNApol I 100
4C2M chain 7 Rpa135 RNApol I 97.2
4C2M chain 8 Rpb5 RNApol I 100
4C2M chain 9 Rpa14 RNApol I 59.6
4C2M chain 10 Rpa43 RNApol I 81.4
4C2M chain 11 Rpo26 RNApol I 100
4C2M chain 12 Rpa12 RNApol I 100
4C2M chain 13 Rpb8 RNApol I 88.2
4C2M chain 14 Rpc19 RNApol I 100
4C2M chain 15 Rpb10 RNApol I 100
4C2M chain 16 Rpa49 RNApol I 100
4C2M chain 17 Rpc10 RNApol I 100
4C2M chain 18 Rpa43 RNApol I 100
4C2M chain 19 Rpa34 RNApol I 92.4
4C2M chain 20 Rpa135 RNApol I 96.2
4C2M chain 21 Rpa190 RNApol I 88.5
4C2M chain 22 Rpa14 RNApol I 55.1
4C2M chain 23 Rpc40 RNApol I 100
4C2M chain 24 Rpo26 RNApol I 100
4C2M chain 25 Rpb5 RNApol I 100
4C2M chain 26 Rpb8 RNApol I 88.2
4C2M chain 27 Rpa43 RNApol I 80.2
4C2M chain 28 Rpb10 RNApol I 100
4C2M chain 29 Rpa12 RNApol I 96
4C2M chain 30 Rpc19 RNApol I 100
4C3I chain A Rpa190 RNApol I 89.2
4C3I chain C Rpc40 RNApol I 99.3
4C3I chain B Rpa135 RNApol I 98.2
4C3I chain E Rpb5 RNApol I 100
4C3I chain D Rpa14 RNApol I 55.1
4C3I chain G Rpa43 RNApol I 78.3
4C3I chain F Rpo26 RNApol I 100
4C3I chain I Rpa12 RNApol I 100
4C3I chain H Rpb8 RNApol I 84.7
4C3I chain K Rpc19 RNApol I 100
4C3I chain J Rpb10 RNApol I 100
4C3I chain M Rpa49 RNApol I 97.2
4C3I chain L Rpc10 RNApol I 100
4C3I chain N Rpa34 RNApol I 88
4V1N chain A Rpo21 RNApol II 97.9
4V1N chain C Rpb3 RNApol II 100
4V1N chain B Rpb2 RNApol II 93.6
4V1N chain E Rpb5 RNApol II 100
4V1N chain D Rpb4 RNApol II 80.8
4V1N chain G Rpb7 RNApol II 100
4V1N chain F Rpo26 RNApol II 100
4V1N chain I Rpb9 RNApol II 100
4V1N chain H Rpb8 RNApol II 91
4V1N chain K Rpb11 RNApol II 100
4V1N chain J Rpb10 RNApol II 100
4V1N chain L Rpc10 RNApol II 100
4V1N chain R Tfg2 RNApol II 60.3
5FJA chain A Rpo31 RNApol III 96.2
5FJA chain C Rpc40 RNApol III 100
5FJA chain B Ret1 RNApol III 100
5FJA chain E Rpb5 RNApol III 100
5FJA chain D Rpc17 RNApol III 73.9
5FJA chain G Rpc25 RNApol III 85.8
5FJA chain F Rpo26 RNApol III 100
5FJA chain I Rpc11 RNApol III 82.7
5FJA chain H Rpb8 RNApol III 94.5
5FJA chain K Rpc19 RNApol III 100
5FJA chain J Rpb10 RNApol III 100
5FJA chain M Rpc37 RNApol III 84.9
5FJA chain L Rpc10 RNApol III 100
5FJA chain O Rpc82 RNApol III 84.3
5FJA chain N Rpc53 RNApol III 73.8
5FJA chain Q Rpc31 RNApol III 100
5FJA chain P Rpc34 RNApol III 57.2

Table S2C. Identity between proteasome structure and the experimental sequence

Columns: Reference, Yeast protein, Complex, Identity (%)

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 97.1

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 97.1

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 82.9

5A5B-centered chain E Pre2 Proteasome 99.5

5A5B-centered chain EA Rpn11 Proteasome 89.5

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 85.9

5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 93.1

5A5B-centered chain W Rpn2 Proteasome 90.9

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 93.1

Constructed proteasome chain 22 Rpn2 Proteasome 90.9

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 82.9

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 89.5

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 85.9

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100


Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100

Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

Columns: Yeast protein, Complex, Reference, Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions further in space. (E) Repetition of the standard DHFR PCA for selected results for the global PCA experiment, showing a strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results for the intra-complex experiment. Examples for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates with colony size in standard PCA (left).


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
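A distance-weighted shortest path between two C-termini, as illustrated in panels (C) and (D), can be sketched with Dijkstra's algorithm. The graph construction below is an assumption made for illustration: nodes are surface residues, two residues are linked when they are no farther apart than a hypothetical `max_edge` value, and each link is weighted by the Euclidean distance. The function name and toy coordinates are ours, and this is not the code used to produce Table S2A.

```python
import heapq
import math

def shortest_surface_path(coords, start, end, max_edge):
    """Length of the distance-weighted shortest path from `start` to `end`.

    coords: dict mapping a residue id to its (x, y, z) position; only surface
    residues and the two C-terminal residues are expected here.
    max_edge: two residues are linked when no farther apart than this value
    (assumed rule for building the graph).
    Returns math.inf when the two residues are not connected.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == end:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry, a shorter route was already found
        for other, xyz in coords.items():
            step = math.dist(coords[node], xyz)
            if other != node and step <= max_edge:
                candidate = dist + step
                if candidate < best.get(other, math.inf):
                    best[other] = candidate
                    heapq.heappush(queue, (candidate, other))
    return math.inf

# Toy example: the direct hop a -> c (about 14.1 Å) exceeds max_edge,
# so the path has to go through b.
coords = {"a": (0.0, 0.0, 0.0), "b": (10.0, 0.0, 0.0), "c": (10.0, 10.0, 0.0)}
print(shortest_surface_path(coords, "a", "c", max_edge=12.0))  # 20.0
```

Restricting the path to surface residues is what keeps the measured distance from cutting through the interior of the complex.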


Conclusion geacuteneacuterale

Le but de ce projet eacutetait de deacutevelopper une meacutethode hybride relativement simple Le terme

meacutethode hybride deacutesigne une meacutethode permettant de deacutetecter des associations entre des

proteacuteines agrave proximiteacute dans lrsquoespace sans qursquoelles ne soient neacutecessairement des interactions

physiques Cette meacutethode permettrait ainsi drsquoapprofondir et de mieux disseacutequer lrsquoarchitecture

des complexes proteacuteiques Concregravetement il srsquoagissait de modifier la longueur des

connecteurs de la DHFR PCA chez S cerevisiae Afin de valider la meacutethode il fallait drsquoabord

veacuterifier si lrsquoaugmentation de la longueur du connecteur permettait de modifier les interactions

deacutetecteacutees Il eacutetait eacutegalement pertinent de veacuterifier lrsquoapplication de la meacutethode pour lrsquoeacutetude de

complexes proteacuteiques agrave lrsquoaide de plusieurs combinaisons de connecteurs de diffeacuterentes

longueurs Enfin la confirmation de la validiteacute de la meacutethode pouvait ecirctre compleacuteteacutee par la

comparaison des reacutesultats obtenus avec les distances mesureacutees agrave partir des structures

proteacuteiques disponibles du proteacuteasome

Les reacutesultats de la premiegravere validation deacutemontrent qursquoen jouant sur un seul paramegravetre soit

en doublant la longueur drsquoun connecteur le ratio signal sur bruit a significativement

augmenteacute permettant une meilleure identification des associations Sept nouvelles

associations ont eacuteteacute observeacutees agrave lrsquointeacuterieur de complexes proteacuteiques et entre diffeacuterents

complexes notamment entre le proteacuteasome et le cytosquelette drsquoactine La nature des

associations deacutetecteacutees suggegravere que la speacutecificiteacute de la DHFR PCA est conserveacutee malgreacute la

modification de la longueur du connecteur Lrsquoeacutetude approfondie des cinq complexes

proteacuteiques montre que la variation de la DHFR PCA permet de deacutetecter de nouvelles

interactions en conservant la speacutecificiteacute de la meacutethode En effet parmi lrsquoensemble des

interactions uniques deacutetecteacutees plus de 30 eacutetaient nouvelles Donc on pourrait srsquoattendre agrave

obtenir pratiquement autant de nouvelles interactions si cette variation de la PCA eacutetait

appliqueacutee agrave des complexes proteacuteiques deacutejagrave eacutetudieacutes Ce pourcentage pourrait varier selon le

nombre de combinaisons de connecteurs de diffeacuterentes longueurs utiliseacute Par exemple ce

nombre pourrait ecirctre reacuteduit en nrsquoutilisant qursquoune seule combinaison puisque certaines

associations proteacuteine-proteacuteine eacutetaient uniquement deacutetectables avec une combinaison preacutecise

de connecteurs Lrsquoutilisation drsquoun connecteur allongeacute pour le fragment DHFR F[12] semble

ecirctre suffisante pour deacutetecter la majoriteacute des nouvelles PPI et celles dont le signal augmente

44

The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by a destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. Moreover, the use of extended linkers informs on the organization of protein complexes, particularly when it involves the central proteins. Finally, the detected associations reflect well the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results obtained, it is possible to confirm that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA represents a valuable advance in the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[12] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the linkers of additional lengths, which would provide a validation and a point of comparison and would reveal problems that may have occurred during the construction of the proteins. For example, it is easier to spot a problem of incorrect recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. However, adding this control certainly makes the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no particular infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously with each other unless they are very close together, which reduces false positives. The DHFR PCA can be performed either on solid medium or in liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These elements provide certain advantages compared with other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would provide several resolutions of the PPI network. One would first need to determine the maximum length that allows the detection of plausible protein-protein associations while limiting false positives. One would also need to determine the optimal increment to maximize new information, taking into account the additional complexity that comes with each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[12] proteins with different linker lengths. Such additional collections would provide an image of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the more the linker length is increased, the more distant the detected associations are, which decreases the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and thus shorter linkers, could be sufficient, whereas the resolution should be lower for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not perfectly known but that are visible by electron microscopy or with other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. For the proteome of an organism, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


Abstract

Understanding the function of cellular systems requires cataloguing how proteins assemble with each other into complexes and determining their spatial relationships. Here, we examine the potential of the yeast Protein-fragment Complementation Assay based on dihydrofolate reductase (DHFR PCA) to obtain low-resolution structural restraints on protein complexes. We show that the use of longer peptide linkers between the fusion proteins and the DHFR fragments significantly improves the detection of protein-protein interactions and reveals interactions between proteins that are farther apart in space. Longer linkers thus provide an enhanced tool for the detection and measurement of protein-protein interactions and protein proximity in living cells. We use this tool to further investigate the architecture of the RNA polymerases, the proteasome and the conserved oligomeric Golgi (COG) complexes in yeast. Our results open new avenues for the dissection of protein networks in living cells.


Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex is dependent on stable association among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86).

The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories of methods are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future development in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-termini. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was shown that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between C-termini of proteins could be measured, protein interactions were 35 times more likely to be detected if the C-termini were within less than 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter allows the detection of configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.
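The distance argument above compares how often interactions are detected on either side of a C-terminus distance cutoff. The following is a minimal sketch of that comparison, not the analysis used in the cited study: the function name, the pair list and the distances are hypothetical illustration values, and only the 82 Å cutoff comes from the text.

```python
def detection_fold_change(pairs, cutoff_angstrom=82.0):
    """Ratio of detection rates for protein pairs below vs. at or above a distance cutoff.

    pairs: iterable of (c_termini_distance_in_angstrom, detected) tuples.
    """
    near = [detected for distance, detected in pairs if distance < cutoff_angstrom]
    far = [detected for distance, detected in pairs if distance >= cutoff_angstrom]
    if not near or not far:
        raise ValueError("need at least one pair on each side of the cutoff")
    rate_near = sum(near) / len(near)
    rate_far = sum(far) / len(far)
    if rate_far == 0:
        raise ValueError("no detection beyond the cutoff; fold change is undefined")
    return rate_near / rate_far


# Hypothetical pairs (distance in angstroms, detected by PCA or not); not data from the study.
example_pairs = [
    (40.0, True), (60.0, True), (75.0, False), (80.0, True),
    (90.0, False), (120.0, True), (150.0, False), (200.0, False),
]
fold = detection_fold_change(example_pairs)
print(f"Detection is {fold:.1f} times more likely below the cutoff")
```

With real data, the same ratio could be recomputed for each linker length to see whether the apparent cutoff shifts as the linker is extended.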

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% Yeast Nitrogen Base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[12]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[12]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[12]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
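To make the linker sizes concrete, the sketch below builds the peptide sequence of each linker from the Gly-Gly-Gly-Gly-Ser motif and gives a rough upper bound on its span. The value of 3.5 Å per residue is an assumption for a fully extended chain, not a figure from this work, and a flexible linker in a cell is expected to be considerably more compact on average.

```python
MOTIF = "GGGGS"  # one-letter code for Gly-Gly-Gly-Gly-Ser

# Assumption: approximate rise per residue of a fully extended peptide chain.
# This only yields an upper bound; it is not a measured property of these linkers.
ANGSTROM_PER_RESIDUE = 3.5


def describe_linker(repeats):
    """Return (sequence, length in residues, maximal extended span in angstroms)."""
    sequence = MOTIF * repeats
    return sequence, len(sequence), len(sequence) * ANGSTROM_PER_RESIDUE


linkers = {f"{n}xL": describe_linker(n) for n in (2, 3, 4)}

for name, (sequence, n_residues, span) in linkers.items():
    print(f"{name}: {sequence} ({n_residues} aa, at most ~{span:.1f} angstroms extended)")
```

Under this assumption, each added repetition extends the theoretical reach of a fragment by roughly 17.5 Å, and both fused fragments contribute to the total distance that can be bridged.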

In order to replace the 2xL from pAG25-linker-DHFR F[12]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[12] and 4xL-DHFR F[12] DNA fragments were synthesized and inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[12] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[12]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[12] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.
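An insert-to-vector ratio for an assembly reaction is usually a molar ratio, so the mass of insert to pipette depends on the relative lengths of the two DNA molecules. The sketch below shows that conversion, assuming the ratio in the text is molar. The fragment sizes and vector mass are hypothetical illustration values, not quantities reported in this work.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio):
    """Mass of insert (ng) that gives the requested insert:vector molar ratio.

    For double-stranded DNA, moles are proportional to mass / length, so
    insert_ng = molar_ratio * vector_ng * insert_bp / vector_bp.
    """
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("fragment lengths must be positive")
    return molar_ratio * vector_ng * insert_bp / vector_bp


# Hypothetical sizes: 100 ng of a 5000 bp digested vector and a 500 bp insert.
mass = insert_mass_ng(vector_ng=100, vector_bp=5000, insert_bp=500, molar_ratio=5)
print(f"Use {mass:.1f} ng of insert for a 5:1 insert:vector molar ratio")
```

For a three-fragment assembly, the same function can be called once per insert with its own length and ratio.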

The pAG25-3x/4xL-DHFR F[12]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. The 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[12]-ADHterm and pAG25-4xL-DHFR F[12]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[12]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[12] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. The 2x/3x/4xL-DHFR F[12]/F[3] fragments, along with the NAT (for DHFR F[12]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[12]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains with proteins fused with the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies raised against them. 20 OD600 of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10,000), IRDye® 680RD donkey anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG (1:10,000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belonged to the same complexes as the baits or because they interacted with one of them based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we performed a review of the literature and considered the consensus protein complexes published by (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested), proteasome (n=47, 44 tested) and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned such that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed into 384-format arrays and finally into 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey plates were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30°C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which the difference in signal was increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function "Analyze particle" in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, using the particle that is closest to the center. We considered the particle as being a colony if the mass center was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2 transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
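The transformation and size filter described in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, argument names and example values are hypothetical, and the threshold of 16 is taken as printed in the text.

```python
import math


def log2_scores(mtx_intensity, diploid_size, min_diploid_size=16):
    """Return log2(intensity + 1) for each colony.

    Colonies whose size on the diploid selection plate is below the
    threshold are returned as None, i.e. eliminated from the analysis.
    """
    scores = []
    for intensity, size in zip(mtx_intensity, diploid_size):
        if size < min_diploid_size:
            scores.append(None)  # diploid colony too small: discard
        else:
            scores.append(math.log2(intensity + 1))  # +1 avoids log2(0)
    return scores


# Hypothetical values for three colonies (MTX intensity, diploid size).
example = log2_scores([0, 3, 1023], [40, 10, 55])
print(example)
```

Adding 1 before the transformation keeps colonies with zero intensity in the data set with a score of 0 rather than an undefined value.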

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
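To make the z-score conversion concrete, the sketch below fits a two-component normal mixture by expectation-maximization and scores each value against the lower (background) component. It is a pure-Python stand-in, not the mixtools implementation used in the study: the initialization, the number of iterations and the simulated scores are all assumptions made for illustration.

```python
import math
import random
import statistics


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(values, n_iter=200):
    """Fit a two-component normal mixture by EM.

    Returns [(mean, sd, weight), (mean, sd, weight)] sorted by mean, so the
    first component is treated as the background distribution.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    # Initial guess: lower and upper halves of the sorted data.
    means = [statistics.mean(ordered[:half]), statistics.mean(ordered[half:])]
    sds = [statistics.pstdev(ordered[:half]) or 1.0,
           statistics.pstdev(ordered[half:]) or 1.0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: probability that each value belongs to component 0.
        resp = []
        for x in values:
            p0 = weights[0] * normal_pdf(x, means[0], sds[0])
            p1 = weights[1] * normal_pdf(x, means[1], sds[1])
            resp.append(p0 / (p0 + p1) if p0 + p1 > 0 else 0.5)
        # M-step: update weight, mean and sd of each component.
        for k in (0, 1):
            r = resp if k == 0 else [1.0 - v for v in resp]
            total = max(sum(r), 1e-12)
            means[k] = sum(ri * x for ri, x in zip(r, values)) / total
            var = sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, values)) / total
            sds[k] = max(math.sqrt(var), 1e-6)
            weights[k] = total / len(values)
    return sorted(zip(means, sds, weights))


def z_scores(values):
    """Convert interaction scores to z-scores: Zs = (Is - b) / sdb."""
    (b, sdb, _), _signal = fit_two_normals(values)
    return [(x - b) / sdb for x in values]


# Simulated scores (hypothetical): 400 background pairs, 60 interacting pairs.
random.seed(0)
scores = [random.gauss(8.0, 0.5) for _ in range(400)]
scores += [random.gauss(13.0, 1.0) for _ in range(60)]
zs = z_scores(scores)
detected = sum(z > 2.5 for z in zs)
print(f"{detected} of {len(scores)} simulated pairs exceed Zs > 2.5")
```

Scoring against the background component only, rather than against the whole distribution, keeps the threshold from drifting when the proportion of true interactions differs between linker combinations.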

For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values below Q1 - 3 × (Q3 - Q1) or above Q3 + 3 × (Q3 - Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
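The outlier rule and the replicate requirement described above can be illustrated as follows. This is a minimal sketch: the quartile method (`method="inclusive"` in Python's `statistics.quantiles`) is an assumption, since the text does not state how quartiles were computed, and the replicate values are invented.

```python
import statistics


def drop_extreme_outliers(values, k=3.0):
    """Keep values within [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


def interaction_score(values, min_replicates=4):
    """Median of the filtered replicates, or None if too few remain."""
    kept = drop_extreme_outliers(values)
    if len(kept) < min_replicates:
        return None
    return statistics.median(kept)


# Hypothetical log2 colony sizes for one bait-prey pair; 2.0 is an outlier.
replicates = [10.1, 10.4, 9.8, 10.0, 10.3, 10.2, 2.0]
kept = drop_extreme_outliers(replicates)
score = interaction_score(replicates)
print(kept, score)
```

A multiplier of 3 on the interquartile range removes only extreme values, so ordinary replicate-to-replicate variation is left in place before the median is taken.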

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned such that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1 with t-test FDR p-value < 0.05) (Table S1C).
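The classification rules above can be summarized in a short sketch. Two assumptions are made here and should be read as such: the FDR correction is implemented as the Benjamini-Hochberg procedure, which the text does not name explicitly, and pairs with a significant p-value but an absolute difference of 1 or less are treated as unchanged. The raw p-values are taken as inputs (for example from a paired t-test computed elsewhere), and all function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def classify(detected_2x, detected_longer, diff, fdr_p,
             alpha=0.05, min_diff=1.0):
    """Assign a protein pair to one of the categories used in the text."""
    if not detected_2x:
        return "new" if detected_longer else "not detected"
    if fdr_p < alpha and abs(diff) > min_diff:
        return "quantitative change"
    return "unchanged"


# Hypothetical raw p-values for four linker-combination comparisons.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print([round(p, 4) for p in adjusted])
print(classify(True, True, diff=1.8, fdr_p=adjusted[0]))
```

Requiring both a corrected p-value below 0.05 and a minimum effect size guards against calling very small but reproducible differences a quantitative change.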

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle, the base and the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.

The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure were considered surface residues (see Fig S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal distances possible was used for the analyses.
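The surface-path distance described above can be illustrated with a small, self-contained example. The study used the Dijkstra implementation in NetworkX; the sketch below substitutes a standard-library version (`heapq`) so that it runs without dependencies. The node names, coordinates and cutoff value are hypothetical and serve only to show how a path along surface nodes differs from the straight-line distance.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than the cutoff.

    Edge weights are the Euclidean distances between the two nodes.
    """
    graph = {name: [] for name in coords}
    names = list(coords)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Weighted shortest path length between two nodes (Dijkstra)."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf  # target not reachable


# Hypothetical surface Cα coordinates (in Å) for four residues.
coords = {
    "a": (0.0, 0.0, 0.0),
    "b": (10.0, 0.0, 0.0),
    "c": (10.0, 10.0, 0.0),
    "d": (20.0, 10.0, 0.0),
}
graph = build_graph(coords, cutoff=15.0)
path_length = shortest_path_length(graph, "a", "d")
straight_line = math.dist(coords["a"], coords["d"])
print(round(path_length, 2), round(straight_line, 2))
```

Because edges exist only between nearby surface nodes, the path length is never shorter than the straight-line distance, which is the property that makes it a proxy for the distance a linker has to span around the complex.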

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases) as compared to the 2xL assay (Fig 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig S1B and Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI).

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in the interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the fact that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for RNApol complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from the lobe a or lobe b (Fig 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and interaction z-score of PPIs for all linker lengths (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linker sizes (Fig S1D). Finally, we find that the distance among proteins is significantly longer for cases where a longer linker size increases signal or leads to the detection of new interactions (Fig 2C). This demonstrates once again that a longer linker size enhances the ability to detect interactions, especially for proteins that are more distant in space.

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys while requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to detect low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institute of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canadian Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.

Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering every pair of reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.

Figure 2. Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.

Table S1A Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request

Table S1B PCA data for global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request

Table S1C PCA data for intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request

Table S1D PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request

Table S2A Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request

Table S2B Identity between each RNApol structures and the experimental sequences

Reference Yeast proteins Complex Identity ()

4C2M chain 1 Rpc10 RNApol I 100

4C2M chain 2 Rpa34 RNApol I 924

4C2M chain 3 Rpa49 RNApol I 944

4C2M chain 4 Rpa43 RNApol I 100

4C2M chain 5 Rpa190 RNApol I 897

4C2M chain 6 Rpc40 RNApol I 100

4C2M chain 7 Rpa135 RNApol I 972

4C2M chain 8 Rpb5 RNApol I 100

4C2M chain 9 Rpa14 RNApol I 596

4C2M chain 10 Rpa43 RNApol I 814

4C2M chain 11 Rpo26 RNApol I 100

4C2M chain 12 Rpa12 RNApol I 100

4C2M chain 13 Rpb8 RNApol I 882

4C2M chain 14 Rpc19 RNApol I 100

4C2M chain 15 Rpb10 RNApol I 100

4C2M chain 16 Rpa49 RNApol I 100

4C2M chain 17 Rpc10 RNApol I 100

4C2M chain 18 Rpa43 RNApol I 100

4C2M chain 19 Rpa34 RNApol I 924

4C2M chain 20 Rpa135 RNApol I 962

4C2M chain 21 Rpa190 RNApol I 885

4C2M chain 22 Rpa14 RNApol I 551

4C2M chain 23 Rpc40 RNApol I 100

4C2M chain 24 Rpo26 RNApol I 100

4C2M chain 25 Rpb5 RNApol I 100

4C2M chain 26 Rpb8 RNApol I 882

4C2M chain 27 Rpa43 RNApol I 802

4C2M chain 28 Rpb10 RNApol I 100

4C2M chain 29 Rpa12 RNApol I 96

4C2M chain 30 Rpc19 RNApol I 100

4C3I chain A Rpa190 RNApol I 892

4C3I chain C Rpc40 RNApol I 993

4C3I chain B Rpa135 RNApol I 982

4C3I chain E Rpb5 RNApol I 100

4C3I chain D Rpa14 RNApol I 551

4C3I chain G Rpa43 RNApol I 783

4C3I chain F Rpo26 RNApol I 100

4C3I chain I Rpa12 RNApol I 100

4C3I chain H Rpb8 RNApol I 847

4C3I chain K Rpc19 RNApol I 100

4C3I chain J Rpb10 RNApol I 100

4C3I chain M Rpa49 RNApol I 972

4C3I chain L Rpc10 RNApol I 100

4C3I chain N Rpa34 RNApol I 88

4V1N chain A Rpo21 RNApol II 979

33

4V1N chain C Rpb3 RNApol II 100

4V1N chain B Rpb2 RNApol II 936

4V1N chain E Rpb5 RNApol II 100

4V1N chain D Rpb4 RNApol II 808

4V1N chain G Rpb7 RNApol II 100

4V1N chain F Rpo26 RNApol II 100

4V1N chain I Rpb9 RNApol II 100

4V1N chain H Rpb8 RNApol II 91

4V1N chain K Rpb11 RNApol II 100

4V1N chain J Rpb10 RNApol II 100

4V1N chain L Rpc10 RNApol II 100

4V1N chain R Tfg2 RNApol II 603

5FJA chain A Rpo31 RNApol III 962

5FJA chain C Rpc40 RNApol III 100

5FJA chain B Ret1 RNApol III 100

5FJA chain E Rpb5 RNApol III 100

5FJA chain D Rpc17 RNApol III 739

5FJA chain G Rpc25 RNApol III 858

5FJA chain F Rpo26 RNApol III 100

5FJA chain I Rpc11 RNApol III 827

5FJA chain H Rpb8 RNApol III 945

5FJA chain K Rpc19 RNApol III 100

5FJA chain J Rpb10 RNApol III 100

5FJA chain M Rpc37 RNApol III 849

5FJA chain L Rpc10 RNApol III 100

5FJA chain O Rpc82 RNApol III 843

5FJA chain N Rpc53 RNApol III 738

5FJA chain Q Rpc31 RNApol III 100

5FJA chain P Rpc34 RNApol III 57.2


Table S2C. Identity between proteasome structure and the experimental sequence

Reference    Yeast proteins    Complex    Identity (%)

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 97.1

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 97.1

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 82.9

5A5B-centered chain E Pre2 Proteasome 99.5

5A5B-centered chain EA Rpn11 Proteasome 89.5

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 85.9


5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 93.1

5A5B-centered chain W Rpn2 Proteasome 90.9

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 93.1

Constructed proteasome chain 22 Rpn2 Proteasome 90.9

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 82.9

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 89.5

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 85.9

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100


Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100
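
The identity values listed above compare each structure chain with the experimental sequence. The following is a minimal sketch of how such a percentage can be computed. It assumes that the two sequences have already been aligned (gaps written as `-`) and that identity is reported over all aligned columns. The function name `percent_identity` and the example sequences are hypothetical, and the alignment procedure actually used for these tables is not described in this section.

```python
def percent_identity(aligned_structure: str, aligned_experimental: str) -> float:
    """Percent identity between two pre-aligned sequences (gaps as '-').

    Assumption: the denominator is the number of alignment columns,
    ignoring columns where both sequences carry a gap.
    """
    if len(aligned_structure) != len(aligned_experimental):
        raise ValueError("aligned sequences must have the same length")

    columns = 0
    matches = 0
    for a, b in zip(aligned_structure, aligned_experimental):
        if a == "-" and b == "-":
            continue  # column carries no residue in either sequence
        columns += 1
        if a == b:  # equal and not both gaps, so both are the same residue
            matches += 1

    if columns == 0:
        raise ValueError("no aligned columns to compare")
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Hypothetical toy alignment: 4 identical positions over 6 columns.
    print(round(percent_identity("ACDE-G", "ACDQFG"), 1))
```

Other conventions divide by the length of the shorter sequence or by the number of gap-free columns, which gives slightly different percentages for chains with unresolved regions.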


Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Yeast proteins    Complex    Reference    Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results of the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in standard PCA (left).
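
The z-score threshold mentioned in panel (B) can be illustrated with a short sketch. It assumes that z-scores are computed against the mean and sample standard deviation of all colony sizes in one experiment. The reference distribution actually used is not stated in this caption, and the function names and the toy colony sizes are hypothetical.

```python
from statistics import mean, stdev


def z_scores(colony_sizes):
    """Z-score of each colony size relative to the whole set (assumed reference)."""
    mu = mean(colony_sizes)
    sigma = stdev(colony_sizes)  # sample standard deviation
    return [(size - mu) / sigma for size in colony_sizes]


def positive_ppis(colony_sizes, threshold=2.5):
    """Indices of colonies whose z-score exceeds the threshold."""
    return [i for i, z in enumerate(z_scores(colony_sizes)) if z > threshold]


if __name__ == "__main__":
    # Hypothetical plate: twenty background colonies and one strong signal.
    sizes = [10.0] * 20 + [100.0]
    print(positive_ppis(sizes))
```

In practice the background distribution is often estimated from negative controls or from a robust statistic such as the median, so the same threshold can select a different set of PPIs depending on that choice.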


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
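
The distance-weighted shortest path of panels (C) and (D) can be sketched as a graph search over surface residues. The sketch below assumes that two surface residues are connected when they lie within a cutoff distance, that each edge is weighted by the Euclidean distance, and that Dijkstra's algorithm is used. The cutoff value, the function name `shortest_surface_path` and the toy coordinates are hypothetical and are not taken from the thesis.

```python
import heapq
import math


def shortest_surface_path(coords, source, target, cutoff=10.0):
    """Length of the shortest path between two residues along surface points.

    coords: list of (x, y, z) positions of surface residues (assumed input).
    Residues closer than `cutoff` are treated as connected, with the
    Euclidean distance as edge weight. Returns math.inf if unreachable.
    """
    best = [math.inf] * len(coords)
    best[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        dist, u = heapq.heappop(heap)
        if u == target:
            return dist
        if dist > best[u]:
            continue  # stale queue entry
        for v, point in enumerate(coords):
            if v == u:
                continue
            step = math.dist(coords[u], point)
            if step <= cutoff and dist + step < best[v]:
                best[v] = dist + step
                heapq.heappush(heap, (best[v], v))
    return math.inf


if __name__ == "__main__":
    # Toy surface: the two end points are too far apart to connect directly,
    # so the path has to go through the intermediate residue.
    surface = [(0.0, 0.0, 0.0), (6.0, 8.0, 0.0), (12.0, 0.0, 0.0)]
    print(shortest_surface_path(surface, 0, 2, cutoff=10.0))
```

Restricting the path to surface residues keeps it from cutting through the protein interior, so the result is never shorter than the straight-line distance between the two C-termini.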


General conclusion

The aim of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close together in space without necessarily being in physical interaction. Such a method would allow the architecture of protein complexes to be explored in greater depth and dissected more finely. In practice, this meant modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results obtained with distances measured from the available protein structures of the proteasome.

The results of the first validation show that by adjusting a single parameter, namely doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain practically as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths used. For example, it could be lower if only a single combination were used, because some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect the majority of the new PPIs and of those whose signal increases. The rare cases where the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when central proteins are involved. Finally, the detected associations reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to DHFR PCA represents a significant advance in the study of protein-protein associations. Simply by doubling the length of the linker on the DHFR F[1,2] fragment, it is possible to increase the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside linkers of additional lengths, which would provide a validation and a point of comparison and would help detect problems that may have arisen during construction of the fusion proteins. For example, it becomes easier to spot incorrect recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Admittedly, adding this control makes the experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close together, which reduces false positives. DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features provide certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths in order to obtain several resolutions of the PPI network. It would first be necessary to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. It would also be necessary to determine the optimal increment that maximizes new information, taking into account the additional complexity that comes with each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or with other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

50

84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709



Introduction

Protein-protein interactions (PPIs) are central to all cellular functions and are largely responsible for translating genotypes into phenotypes (1). Investigations into the organization of PPI networks have revealed important insights into the evolution of cellular functions (30, 31, 55, 71-73) and the robustness of protein complexes to mutations (31, 36, 74, 75), and have shown how the regulation of protein expression at the transcriptional, translational and posttranslational levels contributes to the diversity of protein complex assemblies (76-80).

Methods used to investigate the organization of PPIs can be grouped into two main categories, based on whether they infer co-complex memberships or detect physical association (81). The first category includes methods based on protein purification followed by mass spectrometry. In this case, protein assignment to a specific complex depends on stable associations among proteins that survive cell lysis and fractionation or affinity purification (82, 83). The majority of PPIs that populate interactome databases derive from such methods because a single purification leads to the inference of many interactions among the co-purified proteins. Unfortunately, very little is known about the structural and context dependencies of PPIs inferred from co-complex membership, because detecting an association does not provide information on the spatial organization of the complex (84-86). The second category of methods reports binary or pairwise interactions between proteins and reveals direct or nearly direct interactions. Such methods include the commonly used yeast two-hybrid (Y2H) (51), protein-fragment complementation assays (PCAs) (87) and technologies based on similar principles (52). The two categories are potentially complementary because, on the one hand, they tell us which proteins assemble into complexes in the cell and, on the other hand, how proteins may be physically located relative to one another (84, 88).

Despite this recent progress, there is still a need for tools that can detect proximate relationships among proteins in vivo, which would complement and further enhance our ability to infer the relationships among proteins within and between complexes or subcomplexes. Being able to infer such relationships at different levels of resolution in living cells is key to future development in cell and systems biology, because high-resolution methods such as NMR or X-ray crystallography are not yet amenable to high-throughput analysis and cannot be applied to all protein types. PCA (87, 89) may provide the technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. The first large-scale study performed using DHFR PCA in yeast showed that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between C-termini of proteins could be measured, protein interactions were 35 times more likely to be detected if the C-termini were within less than 82 Å of each other. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% Yeast Nitrogen Base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.

In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains with proteins fused with the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies raised against them. 20 OD600 units of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 units of cells were separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4°C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using the Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them, based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we performed a review of the literature and considered the published consensus protein complexes (84) to choose 95 central and associated protein members of the following complexes: the RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested), proteasome (n=47, 44 tested) and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned in a way that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30°C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30°C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed into 384-format arrays and finally into 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30°C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30°C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30°C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30°C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs whose signal was increased, unchanged or decreased with the longer linkers. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30°C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction where the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, using the particle that is closest to the center. We considered the particle as being a colony if the mass center was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
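The particle filter described above can be sketched in a few lines. This is only an illustration, not the authors' ImageJ script: the `Particle` record and the `pick_colony` helper are hypothetical names, circularity is computed with the usual ImageJ definition (4π × area / perimeter²), and we assume that a particle is rejected as soon as it fails either the area or the circularity threshold.

```python
import math
from dataclasses import dataclass

@dataclass
class Particle:
    area: float         # particle area in pixels
    perimeter: float    # particle perimeter in pixels
    cx: float           # centroid x within the selection
    cy: float           # centroid y within the selection
    touches_edge: bool  # True if the particle touches the selection edge

def circularity(p: Particle) -> float:
    """ImageJ-style circularity: 1.0 for a perfect circle, near 0 for elongated shapes."""
    return 4.0 * math.pi * p.area / (p.perimeter ** 2)

def pick_colony(particles, center, min_area=20.0, min_circ=0.5):
    """Apply the filters, then return the surviving particle closest to the center.

    Assumption: failing either threshold excludes the particle.
    Returns None when no particle passes.
    """
    kept = [
        p for p in particles
        if not p.touches_edge and p.area >= min_area and circularity(p) >= min_circ
    ]
    if not kept:
        return None
    return min(kept, key=lambda p: math.hypot(p.cx - center[0], p.cy - center[1]))

# Hypothetical particles detected in one circular selection centered on (50, 50)
round_colony = Particle(area=math.pi * 100, perimeter=2 * math.pi * 10, cx=52, cy=49, touches_edge=False)
speck = Particle(area=10, perimeter=12, cx=50, cy=50, touches_edge=False)
streak = Particle(area=100, perimeter=100, cx=51, cy=50, touches_edge=False)
edge_blob = Particle(area=400, perimeter=75, cx=5, cy=50, touches_edge=True)

best = pick_colony([round_colony, speck, streak, edge_blob], center=(50, 50))
```

Here the speck fails the area threshold, the streak fails the circularity threshold and the last particle touches the edge, so only the round colony is retained.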

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
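As a minimal sketch of this preprocessing step (the function name and the paired-list input layout are assumptions made for illustration), the transformation and the diploid-size filter could look like this:

```python
import math

def log2_scores(mtx_intensities, diploid_sizes, min_diploid_size=16):
    """Return log2(intensity + 1) for colonies that grew on the diploid selection plate.

    mtx_intensities and diploid_sizes are parallel lists (one entry per colony).
    Colonies whose diploid size is below min_diploid_size are dropped.
    """
    scores = []
    for intensity, size in zip(mtx_intensities, diploid_sizes):
        if size < min_diploid_size:
            continue  # colony did not grow properly after mating: discard it
        scores.append(math.log2(intensity + 1))
    return scores

# Hypothetical values: the third colony is dropped because its diploid size is 10
example = log2_scores([0, 1, 7, 1023], [20, 16, 10, 30])
```

Adding 1 before taking the logarithm maps an intensity of 0 to a score of 0 instead of an undefined value.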

For the global PCA experiment, interactions with at least two replicates for all linker combinations were kept and the median colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered as significantly detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
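The scoring scheme can be illustrated with a small, dependency-free sketch. The analysis itself relied on normalmixEM from the R package mixtools; the plain expectation-maximization loop below is a stand-in written for illustration, the simulated scores are invented, and treating the lower-mean component as the background is an assumption.

```python
import math
import random

def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def fit_two_normals(scores, n_iter=200):
    """Fit a two-component 1-D normal mixture by EM.

    Returns [(weight, mean, sd), (weight, mean, sd)] sorted by increasing mean.
    """
    xs = sorted(scores)
    n = len(xs)
    half = n // 2
    # Crude initialisation: lower half and upper half of the sorted scores
    params = []
    for part in (xs[:half], xs[half:]):
        mu = sum(part) / len(part)
        sd = math.sqrt(sum((x - mu) ** 2 for x in part) / len(part)) or 1.0
        params.append((0.5, mu, sd))
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each score
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, mu, sd) for w, mu, sd in params]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weight, mean and standard deviation of each component
        new_params = []
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, xs)) / nk
            new_params.append((nk / n, mu, max(math.sqrt(var), 1e-6)))
        params = new_params
    return sorted(params, key=lambda p: p[1])

def z_score(interaction_score, bg_mean, bg_sd):
    """Zs = (Is - b) / sdb."""
    return (interaction_score - bg_mean) / bg_sd

# Simulated interaction scores: a large background and a smaller set of true interactions
random.seed(0)
scores = [random.gauss(8.0, 0.5) for _ in range(400)] + [random.gauss(12.0, 1.0) for _ in range(100)]

(bg_weight, bg_mean, bg_sd), _signal = fit_two_normals(scores)
detected = [s for s in scores if z_score(s, bg_mean, bg_sd) > 2.5]
```

Because the z-score is expressed relative to the background component only, the 2.5 threshold keeps the same meaning across linker combinations even when the proportion of true interactions differs.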

For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were kept and the median colony size was used as the Is. Significant interactions were identified as described above (Fig S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered as being detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
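The extreme-outlier rule is simple to sketch. The helper names below are hypothetical, and the quartile convention is an assumption: `method="inclusive"` in Python's `statistics.quantiles` should correspond to the default quantile definition in R, but the source does not state which definition was used.

```python
from statistics import quantiles

def extreme_outlier_bounds(values, k=3.0):
    """Return (Q1 - k*IQR, Q3 + k*IQR), with IQR = Q3 - Q1."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def drop_extreme_outliers(values, k=3.0):
    """Keep only the values lying within the extreme-outlier fences."""
    low, high = extreme_outlier_bounds(values, k)
    return [v for v in values if low <= v <= high]

# Hypothetical replicate colony sizes for one interaction; 100 is an extreme outlier
replicates = [10, 11, 12, 13, 14, 15, 16, 100]
bounds = extreme_outlier_bounds(replicates)
kept = drop_extreme_outliers(replicates)
```

With a factor of 3 rather than the more common 1.5, only values far from the bulk of the replicates are discarded.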

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned in a way that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
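The three categories can be summarized as a small decision function. This is a sketch under stated assumptions: the text does not say which FDR procedure was used, so Benjamini-Hochberg is assumed here, the paired t-test p-values are taken as given inputs, and cases the text does not describe (for instance a significant p-value with an absolute difference below 1) fall into the unchanged category in this sketch.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values, in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify_pair(detected_ref, detected_long, diffs, fdr_pvals, min_diff=1.0, alpha=0.05):
    """Classify one protein pair.

    detected_ref  : True if Zs > 2.5 with the reference 2xL-2xL combination
    detected_long : True if Zs > 2.5 with at least one longer combination
    diffs         : Is differences versus 2xL-2xL for 2xL-4xL, 4xL-2xL and 4xL-4xL
    fdr_pvals     : FDR-corrected paired t-test p-values for the same combinations
    """
    if not detected_ref:
        return "new" if detected_long else "not detected"
    changed = any(abs(d) > min_diff and p < alpha for d, p in zip(diffs, fdr_pvals))
    return "quantitative change" if changed else "unchanged"

# Hypothetical raw p-values for four tests
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])

# Hypothetical pairs
pair_a = classify_pair(False, True, [2.1, 1.8, 2.5], [0.01, 0.02, 0.01])
pair_b = classify_pair(True, True, [0.2, -0.1, 0.3], [0.60, 0.80, 0.40])
pair_c = classify_pair(True, True, [0.4, 1.6, 0.2], [0.30, 0.01, 0.50])
```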

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig S2B). This region is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of the edges was equal to the distance between node pairs. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered as surface residues (see Fig S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal distances possible was used for the analyses.
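The surface-path distance can be sketched as follows. The analysis used the Dijkstra implementation from NetworkX; to keep this example dependency-free, a small `heapq`-based version is written out instead. All function names and the toy coordinates are hypothetical, and parsing of the PDB and “wrl” files is left out.

```python
import heapq
import math

def surface_nodes(ca_coords, dots, radius):
    """Keep residues whose C-alpha lies within `radius` of any solvent dot."""
    return {res: xyz for res, xyz in ca_coords.items()
            if any(math.dist(xyz, dot) <= radius for dot in dots)}

def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`; edge weight = distance."""
    nodes = list(coords)
    graph = {node: [] for node in nodes}
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > best.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < best.get(v, math.inf):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf

# Toy C-alpha coordinates (in angstroms); "core" is buried and "far" is isolated
ca = {
    "A_Cterm": (0.0, 0.0, 0.0),
    "mid": (10.0, 0.0, 0.0),
    "B_Cterm": (20.0, 0.0, 0.0),
    "core": (10.0, -40.0, 0.0),
    "far": (100.0, 0.0, 0.0),
}
dots = [(0.0, 5.0, 0.0), (10.0, 5.0, 0.0), (20.0, 5.0, 0.0), (100.0, 5.0, 0.0)]

surface = surface_nodes(ca, dots, radius=15.0)
graph = build_graph(surface, cutoff=15.0)
d_ab = shortest_path_length(graph, "A_Cterm", "B_Cterm")
d_far = shortest_path_length(graph, "A_Cterm", "far")
```

Routing the path through surface residues approximates a distance travelled around the complex, which is why it can exceed the straight-line distance between two C-termini.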

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtering (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI).
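The distinction between all detected pairs and unique pairs can be illustrated with a minimal Python sketch. The function name `unique_pairs` and the example interactions are hypothetical and are not taken from the data set; the sketch only shows how two reciprocal bait-prey orientations collapse into one unordered pair.

```python
from typing import FrozenSet, Iterable, Set, Tuple


def unique_pairs(detected: Iterable[Tuple[str, str]]) -> Set[FrozenSet[str]]:
    """Collapse (bait, prey) orientations into unordered protein pairs."""
    return {frozenset((bait, prey)) for bait, prey in detected}


# Hypothetical detected interactions, written as (bait, prey).
detected = [
    ("Rpb5", "Rpo26"),
    ("Rpo26", "Rpb5"),  # reciprocal orientation of the first entry
    ("Rpb5", "Rpb10"),
    ("Cog1", "Cog2"),
]

directed = set(detected)         # each orientation counted separately
unique = unique_pairs(detected)  # reciprocal orientations counted once
print(len(directed), len(unique))
```

Using a `frozenset` as the key makes the pair order-independent, so X-Y and Y-X map to the same entry.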

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.
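The counts quoted in this paragraph can be cross-checked with a few lines of Python. This is only an arithmetic sketch: the helper `summarize` is a hypothetical name, and the numbers are the ones given in the text (228 pairs, of which 135 unchanged, 19 quantitatively changed and 74 new; 197 unique pairs, of which 114, 18 and 65).

```python
from typing import Dict


def summarize(total: int, unchanged: int, changed: int, new: int) -> Dict[str, float]:
    """Check that the three categories partition the pairs, then return proportions."""
    affected = changed + new
    if unchanged + affected != total:
        raise ValueError("categories do not add up to the total")
    return {
        "affected": affected,
        "unchanged_fraction": unchanged / total,
        "changed_fraction": changed / total,
        "new_fraction": new / total,
    }


all_pairs = summarize(total=228, unchanged=135, changed=19, new=74)
unique = summarize(total=197, unchanged=114, changed=18, new=65)

print(all_pairs["affected"], f"{all_pairs['unchanged_fraction']:.1%}")  # 93 59.2%
print(unique["affected"], f"{unique['unchanged_fraction']:.1%}")        # 83 57.9%
```

The unchanged fractions come out just under 60% in both cases, and new PPIs represent roughly a third of the unique pairs.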

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and interaction z-score of PPIs for all lengths of linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linker sizes (Fig. S1D). Finally, we find that the distance among proteins is significantly longer for cases where longer linker size increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that longer linker size enhances the ability to detect interactions, especially for proteins that are more distant in space.
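The kind of rank correlation reported between pairwise distance and z-score (Spearman's r in Fig. 2B) can be sketched in plain Python. The implementation below is a generic textbook version (Pearson correlation computed on average ranks), not the code used for the analysis, and the distance and z-score values are invented for illustration.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical C-terminus distances and interaction z-scores.
distances = [35.0, 60.0, 80.0, 120.0, 150.0, 210.0]
z_scores = [9.1, 7.4, 8.0, 4.2, 3.9, 1.1]
print(round(spearman(distances, z_scores), 2))
```

A negative value indicates that pairs farther apart tend to give a weaker PCA signal, which is the direction of the trend described above.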

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys while requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to detect low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering each pair of reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A. Description of the strains constructed and used for this study.

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment.

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment.

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study.

Table S1D is too lengthy to be included in this document but can be obtained upon request.


Table S2A. Distances between C-termini calculated from molecular modeling.

Table S2A is too lengthy to be included in this document but can be obtained upon request.


Table S2B. Identity between each RNApol structure and the experimental sequences.

Reference        Yeast protein  Complex      Identity (%)
4C2M chain 1     Rpc10          RNApol I     100
4C2M chain 2     Rpa34          RNApol I     92.4
4C2M chain 3     Rpa49          RNApol I     94.4
4C2M chain 4     Rpa43          RNApol I     100
4C2M chain 5     Rpa190         RNApol I     89.7
4C2M chain 6     Rpc40          RNApol I     100
4C2M chain 7     Rpa135         RNApol I     97.2
4C2M chain 8     Rpb5           RNApol I     100
4C2M chain 9     Rpa14          RNApol I     59.6
4C2M chain 10    Rpa43          RNApol I     81.4
4C2M chain 11    Rpo26          RNApol I     100
4C2M chain 12    Rpa12          RNApol I     100
4C2M chain 13    Rpb8           RNApol I     88.2
4C2M chain 14    Rpc19          RNApol I     100
4C2M chain 15    Rpb10          RNApol I     100
4C2M chain 16    Rpa49          RNApol I     100
4C2M chain 17    Rpc10          RNApol I     100
4C2M chain 18    Rpa43          RNApol I     100
4C2M chain 19    Rpa34          RNApol I     92.4
4C2M chain 20    Rpa135         RNApol I     96.2
4C2M chain 21    Rpa190         RNApol I     88.5
4C2M chain 22    Rpa14          RNApol I     55.1
4C2M chain 23    Rpc40          RNApol I     100
4C2M chain 24    Rpo26          RNApol I     100
4C2M chain 25    Rpb5           RNApol I     100
4C2M chain 26    Rpb8           RNApol I     88.2
4C2M chain 27    Rpa43          RNApol I     80.2
4C2M chain 28    Rpb10          RNApol I     100
4C2M chain 29    Rpa12          RNApol I     96
4C2M chain 30    Rpc19          RNApol I     100
4C3I chain A     Rpa190         RNApol I     89.2
4C3I chain C     Rpc40          RNApol I     99.3
4C3I chain B     Rpa135         RNApol I     98.2
4C3I chain E     Rpb5           RNApol I     100
4C3I chain D     Rpa14          RNApol I     55.1
4C3I chain G     Rpa43          RNApol I     78.3
4C3I chain F     Rpo26          RNApol I     100
4C3I chain I     Rpa12          RNApol I     100
4C3I chain H     Rpb8           RNApol I     84.7
4C3I chain K     Rpc19          RNApol I     100
4C3I chain J     Rpb10          RNApol I     100
4C3I chain M     Rpa49          RNApol I     97.2
4C3I chain L     Rpc10          RNApol I     100
4C3I chain N     Rpa34          RNApol I     88
4V1N chain A     Rpo21          RNApol II    97.9
4V1N chain C     Rpb3           RNApol II    100
4V1N chain B     Rpb2           RNApol II    93.6
4V1N chain E     Rpb5           RNApol II    100
4V1N chain D     Rpb4           RNApol II    80.8
4V1N chain G     Rpb7           RNApol II    100
4V1N chain F     Rpo26          RNApol II    100
4V1N chain I     Rpb9           RNApol II    100
4V1N chain H     Rpb8           RNApol II    91
4V1N chain K     Rpb11          RNApol II    100
4V1N chain J     Rpb10          RNApol II    100
4V1N chain L     Rpc10          RNApol II    100
4V1N chain R     Tfg2           RNApol II    60.3
5FJA chain A     Rpo31          RNApol III   96.2
5FJA chain C     Rpc40          RNApol III   100
5FJA chain B     Ret1           RNApol III   100
5FJA chain E     Rpb5           RNApol III   100
5FJA chain D     Rpc17          RNApol III   73.9
5FJA chain G     Rpc25          RNApol III   85.8
5FJA chain F     Rpo26          RNApol III   100
5FJA chain I     Rpc11          RNApol III   82.7
5FJA chain H     Rpb8           RNApol III   94.5
5FJA chain K     Rpc19          RNApol III   100
5FJA chain J     Rpb10          RNApol III   100
5FJA chain M     Rpc37          RNApol III   84.9
5FJA chain L     Rpc10          RNApol III   100
5FJA chain O     Rpc82          RNApol III   84.3
5FJA chain N     Rpc53          RNApol III   73.8
5FJA chain Q     Rpc31          RNApol III   100
5FJA chain P     Rpc34          RNApol III   57.2


Table S2C. Identity between the proteasome structure and the experimental sequence.

Reference                        Yeast protein  Complex      Identity (%)
5CZ4-centered chain A            Pre8           Proteasome   100
5CZ4-centered chain AA           Pre4           Proteasome   100
5CZ4-centered chain B            Pre9           Proteasome   100
5CZ4-centered chain BA           Pre3           Proteasome   100
5CZ4-centered chain C            Pre6           Proteasome   100
5CZ4-centered chain D            Pup2           Proteasome   97.1
5CZ4-centered chain E            Pre5           Proteasome   100
5CZ4-centered chain F            Pre10          Proteasome   100
5CZ4-centered chain G            Scl1           Proteasome   100
5CZ4-centered chain H            Pup1           Proteasome   100
5CZ4-centered chain I            Pup3           Proteasome   100
5CZ4-centered chain J            Pre1           Proteasome   100
5CZ4-centered chain K            Pre2           Proteasome   100
5CZ4-centered chain L            Pre7           Proteasome   100
5CZ4-centered chain M            Pre4           Proteasome   100
5CZ4-centered chain N            Pre3           Proteasome   100
5CZ4-centered chain O            Pre8           Proteasome   100
5CZ4-centered chain P            Pre9           Proteasome   100
5CZ4-centered chain Q            Pre6           Proteasome   100
5CZ4-centered chain R            Pup2           Proteasome   97.1
5CZ4-centered chain S            Pre5           Proteasome   100
5CZ4-centered chain T            Pre10          Proteasome   100
5CZ4-centered chain U            Scl1           Proteasome   100
5CZ4-centered chain V            Pup1           Proteasome   100
5CZ4-centered chain W            Pup3           Proteasome   100
5CZ4-centered chain X            Pre1           Proteasome   100
5CZ4-centered chain Y            Pre2           Proteasome   100
5CZ4-centered chain Z            Pre7           Proteasome   100
5A5B-centered chain A            Pre3           Proteasome   100
5A5B-centered chain AA           Rpn7           Proteasome   100
5A5B-centered chain B            Pup1           Proteasome   100
5A5B-centered chain BA           Rpn3           Proteasome   100
5A5B-centered chain C            Pup3           Proteasome   100
5A5B-centered chain CA           Rpn12          Proteasome   100
5A5B-centered chain D            Pre1           Proteasome   100
5A5B-centered chain DA           Rpn8           Proteasome   82.9
5A5B-centered chain E            Pre2           Proteasome   99.5
5A5B-centered chain EA           Rpn11          Proteasome   89.5
5A5B-centered chain F            Pre7           Proteasome   100
5A5B-centered chain FA           Rpn10          Proteasome   100
5A5B-centered chain G            Pre4           Proteasome   100
5A5B-centered chain GA           Rpn13          Proteasome   100
5A5B-centered chain HA           Sem1           Proteasome   100
5A5B-centered chain IA           Rpn1           Proteasome   85.9
5A5B-centered chain J            Scl1           Proteasome   100
5A5B-centered chain K            Pre8           Proteasome   100
5A5B-centered chain L            Pre9           Proteasome   100
5A5B-centered chain M            Pre6           Proteasome   100
5A5B-centered chain N            Pup2           Proteasome   100
5A5B-centered chain O            Pre5           Proteasome   100
5A5B-centered chain P            Pre10          Proteasome   100
5A5B-centered chain Q            Rpt1           Proteasome   88
5A5B-centered chain R            Rpt2           Proteasome   100
5A5B-centered chain S            Rpt6           Proteasome   100
5A5B-centered chain T            Rpt3           Proteasome   100
5A5B-centered chain U            Rpt4           Proteasome   100
5A5B-centered chain V            Rpt5           Proteasome   93.1
5A5B-centered chain W            Rpn2           Proteasome   90.9
5A5B-centered chain X            Rpn9           Proteasome   100
5A5B-centered chain Y            Rpn5           Proteasome   100
5A5B-centered chain Z            Rpn6           Proteasome   100
Constructed proteasome chain 1   Pup1           Proteasome   100
Constructed proteasome chain 10  Pre8           Proteasome   100
Constructed proteasome chain 11  Pre9           Proteasome   100
Constructed proteasome chain 12  Pre6           Proteasome   100
Constructed proteasome chain 13  Pup2           Proteasome   100
Constructed proteasome chain 14  Pre5           Proteasome   100
Constructed proteasome chain 15  Pre10          Proteasome   100
Constructed proteasome chain 16  Rpt1           Proteasome   88
Constructed proteasome chain 17  Rpt2           Proteasome   100
Constructed proteasome chain 18  Rpt6           Proteasome   100
Constructed proteasome chain 19  Rpt3           Proteasome   100
Constructed proteasome chain 2   Pup3           Proteasome   100
Constructed proteasome chain 20  Rpt4           Proteasome   100
Constructed proteasome chain 21  Rpt5           Proteasome   93.1
Constructed proteasome chain 22  Rpn2           Proteasome   90.9
Constructed proteasome chain 23  Rpn9           Proteasome   100
Constructed proteasome chain 24  Rpn5           Proteasome   100
Constructed proteasome chain 25  Rpn6           Proteasome   100
Constructed proteasome chain 26  Rpn7           Proteasome   100
Constructed proteasome chain 27  Rpn3           Proteasome   100
Constructed proteasome chain 28  Rpn12          Proteasome   100
Constructed proteasome chain 29  Rpn8           Proteasome   82.9
Constructed proteasome chain 3   Pre1           Proteasome   100
Constructed proteasome chain 30  Rpn11          Proteasome   89.5
Constructed proteasome chain 31  Rpn10          Proteasome   100
Constructed proteasome chain 32  Rpn13          Proteasome   100
Constructed proteasome chain 33  Sem1           Proteasome   100
Constructed proteasome chain 34  Rpn1           Proteasome   85.9
Constructed proteasome chain 35  Pup1           Proteasome   100
Constructed proteasome chain 36  Pup3           Proteasome   100
Constructed proteasome chain 37  Pre1           Proteasome   100
Constructed proteasome chain 38  Pre2           Proteasome   100
Constructed proteasome chain 39  Pre7           Proteasome   100
Constructed proteasome chain 4   Pre2           Proteasome   100
Constructed proteasome chain 40  Pre4           Proteasome   100
Constructed proteasome chain 41  Pre3           Proteasome   100
Constructed proteasome chain 42  Pre4           Proteasome   100
Constructed proteasome chain 45  Scl1           Proteasome   100
Constructed proteasome chain 46  Pre8           Proteasome   100
Constructed proteasome chain 47  Pre9           Proteasome   100
Constructed proteasome chain 48  Pre6           Proteasome   100
Constructed proteasome chain 49  Pup2           Proteasome   100
Constructed proteasome chain 5   Pre7           Proteasome   100
Constructed proteasome chain 50  Pre5           Proteasome   100
Constructed proteasome chain 51  Pre10          Proteasome   100
Constructed proteasome chain 52  Rpt1           Proteasome   88
Constructed proteasome chain 53  Rpt2           Proteasome   100
Constructed proteasome chain 54  Rpt6           Proteasome   100
Constructed proteasome chain 55  Rpt3           Proteasome   100
Constructed proteasome chain 56  Rpt4           Proteasome   100
Constructed proteasome chain 57  Rpt5           Proteasome   93.1
Constructed proteasome chain 58  Rpn2           Proteasome   90.9
Constructed proteasome chain 59  Rpn9           Proteasome   100
Constructed proteasome chain 6   Pre3           Proteasome   100
Constructed proteasome chain 60  Rpn5           Proteasome   100
Constructed proteasome chain 61  Rpn6           Proteasome   100
Constructed proteasome chain 62  Rpn7           Proteasome   100
Constructed proteasome chain 63  Rpn3           Proteasome   100
Constructed proteasome chain 64  Rpn12          Proteasome   100
Constructed proteasome chain 65  Rpn8           Proteasome   82.9
Constructed proteasome chain 66  Rpn11          Proteasome   89.5
Constructed proteasome chain 67  Rpn10          Proteasome   100
Constructed proteasome chain 68  Rpn13          Proteasome   100
Constructed proteasome chain 69  Sem1           Proteasome   100
Constructed proteasome chain 70  Rpn1           Proteasome   85.9
Constructed proteasome chain 9   Scl1           Proteasome   100


Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures.

Yeast protein  Complex      Reference        Number of missing residues in C-ter
Rpa190         RNApol I     4C2M monomer 1   0
Rpa14          RNApol I     4C2M monomer 1   37
Rpa12          RNApol I     4C2M monomer 1   0
Rpb5           RNApol I     4C2M monomer 1   0
Rpb10          RNApol I     4C2M monomer 1   1
Rpa49          RNApol I     4C2M monomer 1   300
Rpc19          RNApol I     4C2M monomer 1   0
Rpb8           RNApol I     4C2M monomer 1   0
Rpa34          RNApol I     4C2M monomer 1   52
Rpa43          RNApol I     4C2M monomer 1   10
Rpc40          RNApol I     4C2M monomer 1   0
Rpc10          RNApol I     4C2M monomer 1   0
Rpa135         RNApol I     4C2M monomer 1   0
Rpo26          RNApol I     4C2M monomer 1   1
Rpa190         RNApol I     4C2M monomer 2   0
Rpa14          RNApol I     4C2M monomer 2   37
Rpa12          RNApol I     4C2M monomer 2   0
Rpb5           RNApol I     4C2M monomer 2   0
Rpb10          RNApol I     4C2M monomer 2   1
Rpa49          RNApol I     4C2M monomer 2   300
Rpc19          RNApol I     4C2M monomer 2   0
Rpb8           RNApol I     4C2M monomer 2   0
Rpa34          RNApol I     4C2M monomer 2   53
Rpa43          RNApol I     4C2M monomer 2   76
Rpc40          RNApol I     4C2M monomer 2   0
Rpc10          RNApol I     4C2M monomer 2   0
Rpa135         RNApol I     4C2M monomer 2   0
Rpo26          RNApol I     4C2M monomer 2   1
Rpa190         RNApol I     4C3I             1
Rpa14          RNApol I     4C3I             37
Rpb5           RNApol I     4C3I             0
Rpb10          RNApol I     4C3I             1
Rpa49          RNApol I     4C3I             301
Rpc19          RNApol I     4C3I             0
Rpb8           RNApol I     4C3I             0
Rpa34          RNApol I     4C3I             53
Rpa12          RNApol I     4C3I             0
Rpa43          RNApol I     4C3I             10
Rpc40          RNApol I     4C3I             0
Rpc10          RNApol I     4C3I             0
Rpa135         RNApol I     4C3I             0
Rpo26          RNApol I     4C3I             1
Rpb3           RNApol II    4V1N             50
Rpb11          RNApol II    4V1N             6
Rpb5           RNApol II    4V1N             0
Rpb7           RNApol II    4V1N             0
Rpb10          RNApol II    4V1N             5
Rpo26          RNApol II    4V1N             0
Rpb8           RNApol II    4V1N             0
Rpb4           RNApol II    4V1N             0
Rpb9           RNApol II    4V1N             2
Tfg2           RNApol II    4V1N             173
Rpb2           RNApol II    4V1N             0
Rpc10          RNApol II    4V1N             0
Rpo21          RNApol II    4V1N             278
Rpc11          RNApol III   5FJA             0
Rpc19          RNApol III   5FJA             0
Ret1           RNApol III   5FJA             0
Rpb5           RNApol III   5FJA             0
Rpb10          RNApol III   5FJA             3
Rpc37          RNApol III   5FJA             20
Rpc82          RNApol III   5FJA             0
Rpc31          RNApol III   5FJA             182
Rpb8           RNApol III   5FJA             0
Rpc53          RNApol III   5FJA             0
Rpc25          RNApol III   5FJA             0
Rpc34          RNApol III   5FJA             2
Rpo31          RNApol III   5FJA             0
Rpc40          RNApol III   5FJA             0
Rpc10          RNApol III   5FJA             0
Rpc17          RNApol III   5FJA             0
Rpo26          RNApol III   5FJA             2
Rpn6           Proteasome   5CZ4 and 5A5B    3
Rpn5           Proteasome   5CZ4 and 5A5B    3
Rpn3           Proteasome   5CZ4 and 5A5B    45
Rpn2           Proteasome   5CZ4 and 5A5B    20
Rpn1           Proteasome   5CZ4 and 5A5B    0
Rpn9           Proteasome   5CZ4 and 5A5B    6
Rpn8           Proteasome   5CZ4 and 5A5B    30
Pre10          Proteasome   5CZ4 and 5A5B    39
Pre6           Proteasome   5CZ4 and 5A5B    10
Pre7           Proteasome   5CZ4 and 5A5B    0
Rpt3           Proteasome   5CZ4 and 5A5B    0
Rpt2           Proteasome   5CZ4 and 5A5B    1
Pre2           Proteasome   5CZ4 and 5A5B    0
Rpt4           Proteasome   5CZ4 and 5A5B    10
Pre1           Proteasome   5CZ4 and 5A5B    3
Pre8           Proteasome   5CZ4 and 5A5B    0
Pre9           Proteasome   5CZ4 and 5A5B    12
Pup2           Proteasome   5CZ4 and 5A5B    9
Pup3           Proteasome   5CZ4 and 5A5B    0
Pup1           Proteasome   5CZ4 and 5A5B    6
Rpn13          Proteasome   5CZ4 and 5A5B    23
Rpn12          Proteasome   5CZ4 and 5A5B    2
Rpn11          Proteasome   5CZ4 and 5A5B    8
Rpn10          Proteasome   5CZ4 and 5A5B    71
Sem1           Proteasome   5CZ4 and 5A5B    0
Scl1           Proteasome   5CZ4 and 5A5B    0
Rpt1           Proteasome   5CZ4 and 5A5B    11
Pre4           Proteasome   5CZ4 and 5A5B    4
Pre5           Proteasome   5CZ4 and 5A5B    0
Rpt5           Proteasome   5CZ4 and 5A5B    0
Pre3           Proteasome   5CZ4 and 5A5B    0
Rpt6           Proteasome   5CZ4 and 5A5B    9
Rpn7           Proteasome   5CZ4 and 5A5B    7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions farther apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results for the intra-complex experiment. Examples for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates with colony size in standard PCA (left).
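How a colony-size threshold translates into a z-score cutoff can be illustrated with a short Python sketch. The exact normalization used for these experiments is not described in this section, so the definition below (deviation from the mean of a background distribution, in units of its standard deviation) is an assumption, as are the function names, the example colony sizes and the default threshold of 2.5.

```python
from statistics import mean, stdev
from typing import List, Sequence


def z_scores(colony_sizes: Sequence[float], background: Sequence[float]) -> List[float]:
    """Express each colony size as a deviation from the background, in SD units."""
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation of the background
    return [(size - mu) / sigma for size in colony_sizes]


def call_positive(
    colony_sizes: Sequence[float],
    background: Sequence[float],
    threshold: float = 2.5,  # assumed cutoff; adjust to the experiment
) -> List[bool]:
    """Flag colonies whose z-score exceeds the threshold."""
    return [z > threshold for z in z_scores(colony_sizes, background)]


# Hypothetical colony sizes (arbitrary units).
background = [100.0, 110.0, 90.0, 105.0, 95.0]
tested = [100.0, 112.0, 130.0]
print(call_positive(tested, background))
```

Keeping the threshold as a parameter makes it easy to see how the set of positive PPIs changes when the cutoff is tightened or relaxed.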


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
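The idea of a distance-weighted shortest path over surface residues can be sketched with Dijkstra's algorithm. This is a generic illustration and not the procedure used to produce Table S2A: the graph construction (linking any two points closer than a cutoff), the function name `surface_path_length`, the cutoff values and the coordinates are all assumptions made for the example.

```python
import heapq
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def surface_path_length(
    points: Sequence[Point], start: int, goal: int, cutoff: float
) -> float:
    """Shortest path length from start to goal, moving only between points
    that are at most `cutoff` apart; edges are weighted by Euclidean distance."""
    best: List[float] = [math.inf] * len(points)
    best[start] = 0.0
    heap = [(0.0, start)]
    while heap:
        dist_so_far, u = heapq.heappop(heap)
        if u == goal:
            return dist_so_far
        if dist_so_far > best[u]:
            continue  # stale heap entry
        for v in range(len(points)):
            if v == u:
                continue
            step = math.dist(points[u], points[v])
            if step <= cutoff and dist_so_far + step < best[v]:
                best[v] = dist_so_far + step
                heapq.heappush(heap, (best[v], v))
    return math.inf  # goal not reachable with this cutoff


# Hypothetical surface points (in angstroms); indices 0 and 2 play the role
# of the two C-terminal residues.
points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
print(surface_path_length(points, start=0, goal=2, cutoff=4.5))  # 7.0
print(surface_path_length(points, start=0, goal=2, cutoff=6.0))  # 5.0
```

With the smaller cutoff the direct 5 Å step is not allowed, so the path follows the intermediate point, which mimics a route constrained to the protein surface rather than a straight line through the complex.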


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close in space without these associations necessarily being physical interactions. Such a method would make it possible to explore and dissect the architecture of protein complexes in greater depth. Concretely, the aim was to modify the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results obtained with the distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variant of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variant of the PCA were applied to protein complexes that have already been studied. This percentage could vary depending on the number of combinations of linkers of different lengths used. For example, it could be lower if only a single combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and those whose signal increases.


The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. Moreover, the use of extended linkers is informative about the organization of protein complexes, particularly when it involves central proteins. Finally, the detected associations closely reflect the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results obtained, it is possible to confirm that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA represents a real advance in the study of protein-protein associations. Simply by doubling the length of the linker of the DHFR F[1,2] fragment, it is possible to increase the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the linkers of additional lengths, which would provide a validation and a point of comparison and would help detect problems that may have occurred during the construction of the proteins. For example, it is easier to spot a problem of incorrect recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. However, adding this control certainly makes the experiments and analyses more complex. Despite this drawback, this variant of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied on a large scale using a robotic platform. Furthermore, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously with each other unless they are very close together, which reduces false positives. The DHFR PCA can be carried out on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would give several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The second would be to determine the optimal increment, the one that maximizes new information while accounting for the complexity added by each extra linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution made possible by lengthening the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


technological advantages required for such an approach by complementing methods detecting co-complex membership and direct interactions.

PCA relies on the fusion of two proteins of interest with fragments of a reporter protein, usually at their C-terminus. Upon interaction, the two fragments assemble into a functional protein that acts as a reporter for the association of the two proteins (55, 89-94). Proteins are usually connected to the reporter fragments with a linker of ten amino acids. In principle, the length of the linker limits the maximum distance between the proteins for an interaction to be detectable. In the first large-scale study performed using DHFR PCA in yeast, it was shown that the distance constraint determined by linker length could affect the ability to detect PPIs (55). For the RNA polymerase (RNApol) II complex and several other protein complexes for which the distance between the C-termini of proteins could be measured, protein interactions were 3.5 times more likely to be detected if the C-termini were less than 82 Å apart. In addition, an earlier study in mammalian cells showed that increasing the linker length of the PCA reporter makes it possible to detect configuration changes in a dimeric membrane receptor (69). Together, these results suggest that linkers of variable sizes could improve the detection of PPIs and even be used as a ruler to infer, albeit roughly, distances between proteins in living cells. Here, we test the effect of linker size on the ability to detect PPIs by PCA in living cells using the yeast DHFR PCA.

Material and Methods

Yeast

Yeast strains used in this study were constructed (as described below) or are from the Yeast Protein Interactome Collection (55). They all derive from the BY4741 (MATa his3∆ leu2∆ met15∆ ura3∆) and BY4742 (MATα his3∆ leu2∆ lys2∆ ura3∆) backgrounds. Cells were grown on YPD medium (1% yeast extract, 2% tryptone, 2% glucose, and 2% agar (for solid medium)) containing 100 µg/mL nourseothricin (clonNAT) and/or 250 µg/mL hygromycin B (HygB) for transformations and diploid selection. For the DHFR PCA experiment, cells were grown on MTX medium (0.67% yeast nitrogen base without amino acids and without ammonium sulfate, 2% glucose, 2.5% Noble agar, drop-out without adenine, methionine and lysine, and 200 µg/mL methotrexate (MTX) diluted in DMSO).


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl, and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.
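The linker design described above can be sketched in a few lines of Python. This is only an illustration: the codon rotation in `encode` is a hypothetical scheme, not the codon choice used in the actual plasmids, and the function names are invented for this example. It shows that the 2xL, 3xL and 4xL linkers contain 10, 15 and 20 residues, and that two different DNA sequences built from synonymous codons give the same peptide.

```python
MOTIF = "GGGGS"  # one Gly-Gly-Gly-Gly-Ser repeat, one-letter code

# Standard genetic code, restricted to the two residues found in the motif.
CODONS = {
    "G": ["GGT", "GGC", "GGA", "GGG"],
    "S": ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"],
}
CODON_TO_AA = {codon: aa for aa, codons in CODONS.items() for codon in codons}


def linker_peptide(repeats):
    """Peptide sequence of an n-repeat linker (2 -> 2xL, 3 -> 3xL, 4 -> 4xL)."""
    return MOTIF * repeats


def encode(peptide, offset=0):
    """Reverse-translate a peptide, rotating the codon choice by `offset`.

    The rotation is a hypothetical scheme; it only illustrates that
    different codon choices can encode the same residues.
    """
    return "".join(
        CODONS[aa][(i + offset) % len(CODONS[aa])] for i, aa in enumerate(peptide)
    )


def translate(dna):
    """Translate a DNA string made only of Gly and Ser codons."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


if __name__ == "__main__":
    for n in (2, 3, 4):
        peptide = linker_peptide(n)
        print(f"{n}xL: {len(peptide)} residues, {peptide}")
    first, second = encode(MOTIF, 0), encode(MOTIF, 1)
    print(first, second, translate(first) == translate(second))
```

Varying the codons between repeats is a common way to limit repeated DNA stretches while keeping the peptide unchanged; whether that was the motivation here is not stated in the text.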

To replace the 2xL of pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3x/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was extracted from the gel. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.

The pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3x/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was extracted from the gel. The remaining steps were performed as described above for the pAG25-3x/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusions for all strains.

Estimation of protein abundance

Protein quantification was done by Western blot for several strains with proteins fused with the 2xL and 4xL. These proteins were selected because we could easily assess their abundance using antibodies directed against them. 20 OD600 units of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 units of cells were separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gel and transferred onto a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), goat anti-Bcy1p (1:1000) or mouse anti-actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD goat anti-rabbit IgG (1:10000), IRDye® 680RD donkey anti-goat IgG (1:5000) or IRDye® 800CW goat anti-mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or interact with one of them, based on data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published by Benschop et al. (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
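The block design and the double border can be illustrated with a short sketch. It assumes the usual 32 x 48 layout of a 1536-position array and a border two colonies wide on every side; the function names are invented for this example and the random placement of blocks is not reproduced.

```python
from itertools import product

LINKERS = ("2xL", "4xL")


def linker_block():
    """The four linker pairings tested within one block of colonies."""
    return [f"{first}-{second}" for first, second in product(LINKERS, repeat=2)]


def interior_positions(rows=32, cols=48, border=2):
    """Grid positions left for test colonies once the border is reserved.

    The 32 x 48 grid and the two-colony border are assumptions made
    for this illustration.
    """
    return [
        (r, c)
        for r in range(border, rows - border)
        for c in range(border, cols - border)
    ]


if __name__ == "__main__":
    print(linker_block())
    free = len(interior_positions())
    print(free, "interior positions,", free // len(linker_block()), "blocks of four at most")
```

Under these assumptions, a double border leaves 28 x 44 interior positions for the blocks of four linker combinations.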

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which differences in signal were increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. Correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/ml of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).
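As an illustration of the dilution series described above, the following minimal Python sketch computes the OD600 expected at each step of a five-fold series starting from an OD600 of 1. The function name and the number of steps (four here) are assumptions made for the example; the number of dilutions spotted is not stated in the text.

```python
def serial_dilution(start_od, fold, steps):
    """Return the starting OD followed by the OD after each successive dilution."""
    return [start_od / fold ** i for i in range(steps + 1)]


# Hypothetical example: starting culture plus four five-fold dilutions.
for step, od in enumerate(serial_dilution(start_od=1.0, fold=5, steps=4)):
    print(f"dilution {step}: OD600 = {od:.4f}")
```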

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction where the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function "Analyze particle" in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, using the particle that is closest to the center. We considered the particle as being a colony if the mass center was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
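The particle selection rules above can be sketched in a few lines of Python. This is not the custom ImageJ script used in the study; it is a hedged illustration that assumes each detected particle is already summarized by its area, perimeter and center of mass, and that a particle is rejected if it fails either the area or the circularity criterion. The `Particle` class and `pick_colony` function are hypothetical names. Circularity is computed as 4π × area / perimeter², the definition used by ImageJ.

```python
import math
from dataclasses import dataclass


@dataclass
class Particle:
    area: float          # particle area in pixels
    perimeter: float     # particle perimeter in pixels
    x: float             # center of mass, x coordinate
    y: float             # center of mass, y coordinate
    touches_edge: bool   # True if the particle touches the selection edge


def circularity(p):
    """4*pi*area / perimeter**2: 1.0 for a perfect circle, lower for elongated shapes."""
    return 4 * math.pi * p.area / p.perimeter ** 2


def pick_colony(particles, center, min_area=20, min_circularity=0.5):
    """Return the retained particle closest to the expected colony center, or None."""
    kept = [
        p for p in particles
        if not p.touches_edge
        and p.area >= min_area
        and circularity(p) >= min_circularity
    ]
    if not kept:
        return None
    return min(kept, key=lambda p: math.hypot(p.x - center[0], p.y - center[1]))
```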

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
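A minimal sketch of this transformation, assuming colony measurements are held in dictionaries keyed by plate position (a hypothetical data layout, as is the function name):

```python
import math


def transform_intensities(mtx_intensity, diploid_size, min_diploid_size=16):
    """log2(x + 1) transform of MTX intensities, dropping positions whose
    colony on the diploid selection plate is smaller than the cutoff."""
    transformed = {}
    for position, value in mtx_intensity.items():
        if diploid_size.get(position, 0) < min_diploid_size:
            continue  # colony too small on the diploid plate: eliminated
        transformed[position] = math.log2(value + 1)
    return transformed


# Hypothetical values: position A2 is removed because its diploid colony is too small.
print(transform_intensities({"A1": 0, "A2": 7, "A3": 1023},
                            {"A1": 50, "A2": 10, "A3": 100}))
```

Adding 1 before taking the logarithm maps an intensity of 0 to a score of 0 instead of an undefined value.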

For the global PCA experiment, interactions with at least two replicates for all linker combinations were conserved and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered as significant detected interactions. These Zs were used to compare the same interaction with different linker size combinations. We considered changes significant when Zs differed by more than 2.
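The scoring procedure above can be illustrated with a small, self-contained Python sketch. The analysis itself was done in R with mixtools; the code below is only a stand-in that fits a two-component normal mixture with a plain expectation-maximization loop, takes the component with the lower mean as the background, and converts scores to z-scores. The function names, the initialization and the fixed number of iterations are assumptions made for the example.

```python
import math
import statistics


def normal_pdf(x, mu, sd):
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(scores, n_iter=200):
    """EM for a two-component 1D normal mixture.
    Returns [(weight, mean, sd), (weight, mean, sd)] sorted by mean."""
    xs = sorted(scores)
    half = len(xs) // 2
    # Crude initialization: means of the lower and upper halves of the data.
    mu = [statistics.mean(xs[:half]), statistics.mean(xs[half:])]
    sd = [statistics.pstdev(xs) or 1.0] * 2
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each score.
        resp = []
        for x in xs:
            p = [w[k] * normal_pdf(x, mu[k], sd[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: update weights, means and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            w[k] = nk / len(xs)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sd[k] = max(math.sqrt(var), 1e-6)
    return sorted(zip(w, mu, sd), key=lambda c: c[1])


def background_parameters(scores):
    """Mean (b) and standard deviation (sdb) of the lower-mean component."""
    (_, b, sdb), _signal = fit_two_normals(scores)
    return b, sdb


def z_scores(scores, threshold=2.5):
    """Return Zs = (Is - b) / sdb for each score and a detected/not-detected flag."""
    b, sdb = background_parameters(scores)
    zs = [(s - b) / sdb for s in scores]
    return zs, [z > threshold for z in zs]
```

With well separated background and signal populations, the lower-mean component captures the background colonies, so only scores far above it exceed the threshold.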

For the intra-complexes experiment, extreme outliers on the MTX selection plates that were more distant from the median than Q1 - 3(Q3 - Q1) or Q3 + 3(Q3 - Q1) were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were conserved and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered as being detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
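The extreme-outlier rule at the start of this paragraph can be sketched as follows. The quartile estimation method is not specified in the text; the example assumes the "inclusive" method of Python's `statistics.quantiles`, which corresponds to the default quantile definition in R. The function name is hypothetical.

```python
import statistics


def remove_extreme_outliers(values, k=3.0):
    """Keep values inside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical colony sizes: the value 100 lies far above Q3 + 3*(Q3 - Q1).
print(remove_extreme_outliers([10, 11, 12, 13, 14, 15, 16, 100]))
```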

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned in a way that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination: 2xL-4xL, 4xL-2xL and 4xL-4xL. A paired t-test was used to discriminate significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1 with t-test FDR p-value < 0.05) (Table S1C).
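The classification into new, unchanged and quantitatively changed interactions can be summarized with the sketch below. It assumes that the paired t-test p-values have already been computed elsewhere and that the FDR correction is the Benjamini-Hochberg procedure; the text only states that p-values were FDR corrected, so this choice, like the function names, is an assumption.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(detected_ref, detected_long, diffs, adj_p, alpha=0.05, min_diff=1.0):
    """Classify one protein pair.

    detected_ref  -- True if detected with the 2xL-2xL reference
    detected_long -- detection flags for 2xL-4xL, 4xL-2xL and 4xL-4xL
    diffs         -- differences from the 2xL-2xL reference, same order
    adj_p         -- FDR-adjusted p-values of the paired t-tests, same order
    """
    if not detected_ref:
        return "new" if any(detected_long) else "not detected"
    changed = any(abs(d) > min_diff and p < alpha for d, p in zip(diffs, adj_p))
    return "quantitative change" if changed else "unchanged"
```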

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched through the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL software. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.

The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of the edges was equal to the distance between node pairs. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome, respectively. These dots were exported in the "wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure were considered as surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal distances possible was used for the analyses.
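The surface-path distance described above can be illustrated with a short Python sketch. The analysis relied on NetworkX; to keep the example dependency-free, the version below builds the same kind of distance-weighted graph and runs a plain Dijkstra search with the standard library. Node names and coordinates are hypothetical toy values, not residues from the structures.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """coords maps a node name to (x, y, z). Two nodes are linked when their
    Euclidean distance is within the cutoff; the edge weight is that distance."""
    graph = {node: [] for node in coords}
    nodes = list(coords)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when no path exists."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return math.inf


# Toy example: four surface points forming a bent path.
coords = {"A": (0, 0, 0), "B": (10, 0, 0), "C": (10, 10, 0), "D": (20, 10, 0)}
graph = build_graph(coords, cutoff=12.0)
print(shortest_path_length(graph, "A", "D"))
```

Because edges only connect nearby surface points, the path length follows the surface of the complex and is therefore at least as long as the straight-line distance between the two end points.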

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n = 45) against 592 prey proteins fused to DHFR F[3] (with regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique; reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] account for only one PPI) after filtration.
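The distinction between all interacting pairs and unique pairs can be made concrete with a small sketch in which a bait-prey pair and its reciprocal are collapsed into a single unordered pair. The function name is hypothetical and the example interactions are invented for illustration; they are not results from Table S1C.

```python
def count_pairs(detected):
    """detected is an iterable of (bait, prey) tuples.
    Returns (number of directed pairs, number of unique unordered pairs)."""
    directed = set(detected)
    unordered = {frozenset(pair) for pair in directed}
    return len(directed), len(unordered)


# Invented example: the first two entries are reciprocal and count as one unique pair.
example = [("Rpb5", "Rpb10"), ("Rpb10", "Rpb5"), ("Pre1", "Pre2")]
print(count_pairs(example))
```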

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in the interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the fact that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had an opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions were detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each of the individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from the lobe a or lobe b (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and interaction z-score of PPIs for all lengths of linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linker sizes is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are better detected with longer linker sizes (Fig. S1D). Finally, we find that the distance among proteins is significantly longer for cases where longer linker size increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that longer linker size enhances the ability to detect interactions, especially for proteins that are more distant in space.
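The relationship between pairwise distance and z-score discussed above is summarized with Spearman rank correlations (Fig. 2B). As a hedged illustration of that statistic, the sketch below computes Spearman's r with the standard library only, using average ranks for ties. The function names are hypothetical and the numbers in the usage example are invented, not taken from Table S2A.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's r: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: z-scores that decrease as distance increases.
distances = [20.0, 45.0, 60.0, 90.0, 120.0]
z = [9.1, 7.4, 7.9, 3.2, 2.8]
print(round(spearman(distances, z), 2))
```

A negative value indicates that larger distances tend to go with lower z-scores, which is the direction reported for the proteasome.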

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys while requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to detect low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institute of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canadian Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.

Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove to be useful to infer the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPI not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering each pair of reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs on total tested for each combination of subcomplexes within complexes.

Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.

Table S1A. Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference        Yeast protein   Complex      Identity (%)
4C2M chain 1     Rpc10           RNApol I     100
4C2M chain 2     Rpa34           RNApol I     92.4
4C2M chain 3     Rpa49           RNApol I     94.4
4C2M chain 4     Rpa43           RNApol I     100
4C2M chain 5     Rpa190          RNApol I     89.7
4C2M chain 6     Rpc40           RNApol I     100
4C2M chain 7     Rpa135          RNApol I     97.2
4C2M chain 8     Rpb5            RNApol I     100
4C2M chain 9     Rpa14           RNApol I     59.6
4C2M chain 10    Rpa43           RNApol I     81.4
4C2M chain 11    Rpo26           RNApol I     100
4C2M chain 12    Rpa12           RNApol I     100
4C2M chain 13    Rpb8            RNApol I     88.2
4C2M chain 14    Rpc19           RNApol I     100
4C2M chain 15    Rpb10           RNApol I     100
4C2M chain 16    Rpa49           RNApol I     100
4C2M chain 17    Rpc10           RNApol I     100
4C2M chain 18    Rpa43           RNApol I     100
4C2M chain 19    Rpa34           RNApol I     92.4
4C2M chain 20    Rpa135          RNApol I     96.2
4C2M chain 21    Rpa190          RNApol I     88.5
4C2M chain 22    Rpa14           RNApol I     55.1
4C2M chain 23    Rpc40           RNApol I     100
4C2M chain 24    Rpo26           RNApol I     100
4C2M chain 25    Rpb5            RNApol I     100
4C2M chain 26    Rpb8            RNApol I     88.2
4C2M chain 27    Rpa43           RNApol I     80.2
4C2M chain 28    Rpb10           RNApol I     100
4C2M chain 29    Rpa12           RNApol I     96
4C2M chain 30    Rpc19           RNApol I     100
4C3I chain A     Rpa190          RNApol I     89.2
4C3I chain C     Rpc40           RNApol I     99.3
4C3I chain B     Rpa135          RNApol I     98.2
4C3I chain E     Rpb5            RNApol I     100
4C3I chain D     Rpa14           RNApol I     55.1
4C3I chain G     Rpa43           RNApol I     78.3
4C3I chain F     Rpo26           RNApol I     100
4C3I chain I     Rpa12           RNApol I     100
4C3I chain H     Rpb8            RNApol I     84.7
4C3I chain K     Rpc19           RNApol I     100
4C3I chain J     Rpb10           RNApol I     100
4C3I chain M     Rpa49           RNApol I     97.2
4C3I chain L     Rpc10           RNApol I     100
4C3I chain N     Rpa34           RNApol I     88
4V1N chain A     Rpo21           RNApol II    97.9
4V1N chain C     Rpb3            RNApol II    100
4V1N chain B     Rpb2            RNApol II    93.6
4V1N chain E     Rpb5            RNApol II    100
4V1N chain D     Rpb4            RNApol II    80.8
4V1N chain G     Rpb7            RNApol II    100
4V1N chain F     Rpo26           RNApol II    100
4V1N chain I     Rpb9            RNApol II    100
4V1N chain H     Rpb8            RNApol II    91
4V1N chain K     Rpb11           RNApol II    100
4V1N chain J     Rpb10           RNApol II    100
4V1N chain L     Rpc10           RNApol II    100
4V1N chain R     Tfg2            RNApol II    60.3
5FJA chain A     Rpo31           RNApol III   96.2
5FJA chain C     Rpc40           RNApol III   100
5FJA chain B     Ret1            RNApol III   100
5FJA chain E     Rpb5            RNApol III   100
5FJA chain D     Rpc17           RNApol III   73.9
5FJA chain G     Rpc25           RNApol III   85.8
5FJA chain F     Rpo26           RNApol III   100
5FJA chain I     Rpc11           RNApol III   82.7
5FJA chain H     Rpb8            RNApol III   94.5
5FJA chain K     Rpc19           RNApol III   100
5FJA chain J     Rpb10           RNApol III   100
5FJA chain M     Rpc37           RNApol III   84.9
5FJA chain L     Rpc10           RNApol III   100
5FJA chain O     Rpc82           RNApol III   84.3
5FJA chain N     Rpc53           RNApol III   73.8
5FJA chain Q     Rpc31           RNApol III   100
5FJA chain P     Rpc34           RNApol III   57.2

Table S2C Identity between proteasome structure and the experimental sequence

Reference Yeast

proteins Complex

Identity

()

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 971

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 971

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 829

5A5B-centered chain E Pre2 Proteasome 995

5A5B-centered chain EA Rpn11 Proteasome 895

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 859

35

5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 931

5A5B-centered chain W Rpn2 Proteasome 909

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 931

Constructed proteasome chain 22 Rpn2 Proteasome 909

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 829

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 895

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 859

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100

36

Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 931

Constructed proteasome chain 58 Rpn2 Proteasome 909

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 829

Constructed proteasome chain 66 Rpn11 Proteasome 895

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100


Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

Columns: Yeast protein, Complex, Reference, Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions further in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in standard PCA (left).
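The thresholding described in panel (B) can be sketched as follows. This is only an illustration: it assumes that the z-score is computed against the mean and standard deviation of the full colony-size distribution and that the cutoff is 2.5; the function names and example data are hypothetical and are not taken from the analysis pipeline used in this work.

```python
from statistics import mean, pstdev


def z_scores(colony_sizes):
    """Standard z-score of each colony size against the whole distribution (assumed definition)."""
    mu = mean(colony_sizes)
    sigma = pstdev(colony_sizes)
    if sigma == 0:
        # All colonies have the same size: no colony stands out.
        return [0.0 for _ in colony_sizes]
    return [(size - mu) / sigma for size in colony_sizes]


def call_positives(colony_sizes, threshold=2.5):
    """Flag a PPI as positive when its z-score exceeds the threshold."""
    return [z > threshold for z in z_scores(colony_sizes)]


# Hypothetical plate: twenty background colonies and one clearly growing colony.
sizes = [10] * 20 + [100]
calls = call_positives(sizes)
```

In this toy plate only the last colony is flagged; real colony-size distributions are broader, so the choice of reference distribution matters.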


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
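A distance-weighted shortest path over surface residues, as illustrated in panels (C) and (D), can be sketched with Dijkstra's algorithm. This is a minimal sketch under stated assumptions: surface residues are represented as 3D points, two points are connected when they lie within a neighbor cutoff, and each edge is weighted by its Euclidean length. The cutoff, the function name and the coordinates are hypothetical and do not reproduce the actual implementation used for the proteasome.

```python
import heapq
import math


def surface_path_length(points, start, end, cutoff):
    """Length of the shortest path from points[start] to points[end].

    Two points are neighbors when their Euclidean distance is <= cutoff
    (assumed rule); the edge weight is that distance. Returns math.inf
    when the two points are not connected.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        dist, i = heapq.heappop(heap)
        if i == end:
            return dist
        if dist > best.get(i, math.inf):
            continue  # stale heap entry
        for j, point in enumerate(points):
            if j == i:
                continue
            step = math.dist(points[i], point)
            if step <= cutoff and dist + step < best.get(j, math.inf):
                best[j] = dist + step
                heapq.heappush(heap, (dist + step, j))
    return math.inf


# Hypothetical coordinates (in angstroms): two C-termini and one surface residue between them.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
path = surface_path_length(coords, 0, 2, cutoff=4.5)
```

With a cutoff of 4.5 the direct 5 Å jump is not allowed, so the path follows the intermediate point. Constraining the path to the surface in this way gives a distance that goes around the complex rather than through it.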


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close to each other in space, without these associations necessarily being physical interactions. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to check whether increasing the linker length changed the interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that adjusting a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thus improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, more than 30% of all unique interactions detected were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths that are used. For example, it could be lower if only one combination were used, because some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases.


The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by a destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when central proteins are involved. Finally, the detected associations reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does make it possible to detect associations between proteins that are further apart in space.

The modification made to the DHFR PCA is a valuable advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths, which would provide a validation and a point of comparison and would help detect problems that occurred during the construction of the proteins. For example, it is easier to spot a faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close to each other, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would provide several resolutions of the PPI network. It would first be necessary to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also need to be determined, to maximize new information while taking into account the added complexity of each additional linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would give a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, could be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


Bacteria

Escherichia coli MC1061 was used for all DNA cloning and propagation steps. Cells were grown on 2YT medium (1% yeast extract, 1.6% tryptone, 0.2% glucose, 0.5% NaCl and 2% agar (for solid medium)) supplemented with 100 µg/mL ampicillin (Amp).

Plasmid construction

Plasmids pAG25-linker-F[1,2]-ADHterm and pAG32-linker-F[3]-ADHterm were used as templates to create new plasmids containing DHFR fragments fused to a linker of varying size. Both original plasmids contained the sequence coding for two repetitions of the motif Gly-Gly-Gly-Gly-Ser (2xL). Additional repetitions of the motif (one for the 3xL and two for the 4xL) were introduced between the existing linker and the DHFR fragments, resulting in plasmids pAG25-3x-linker-F[1,2]-ADHterm, pAG32-3x-linker-F[3]-ADHterm, pAG25-4x-linker-F[1,2]-ADHterm and pAG32-4x-linker-F[3]-ADHterm. The new repetitions were composed of synonymous codons leading to the same peptide sequence.

In order to replace the 2xL from pAG25-linker-DHFR F[1,2]-ADHterm with the 3xL and 4xL, 3xL-DHFR F[1,2] and 4xL-DHFR F[1,2] DNA fragments were synthesized and inserted in the plasmid pUC57 containing flanking BamHI and XbaI restriction sites. The 3xL/4xL-F[1,2] fragments were then amplified by PCR, digested with DpnI and purified. The plasmid pAG25-linker-DHFR F[1,2]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[1,2] region was gel-extracted. The fragments and plasmids were assembled by Gibson cloning (95) with an insert:vector ratio of 5:1. Cloning reactions were transformed in E. coli and clones were selected on 2YT+Amp. Finally, positive clones were verified by double digestion with XbaI and BamHI and by Sanger sequencing.
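
As an illustration of what a 5:1 insert:vector ratio implies at the bench, the short Python sketch below converts a molar ratio into an insert mass. It assumes the ratio is molar, and the fragment lengths and DNA amounts are hypothetical values chosen for the example, not those of the plasmids described here.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, molar_ratio):
    """Mass of insert (ng) giving `molar_ratio` insert molecules per vector molecule.

    For double-stranded DNA, mass is proportional to length times the number of
    molecules, so the molar ratio is scaled by insert_bp / vector_bp.
    """
    return molar_ratio * vector_ng * insert_bp / vector_bp


# Hypothetical example: 50 ng of a 5,000 bp backbone, 600 bp insert, 5:1 ratio
print(insert_mass_ng(vector_ng=50, vector_bp=5000, insert_bp=600, molar_ratio=5))  # 30.0
```

The same function can be called once per insert for a multi-fragment assembly, using each insert's own length and ratio.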

The pAG25-3xL/4xL-DHFR F[1,2]-ADHterm plasmids were used as templates to construct the pAG32-3xL/4xL-DHFR F[3]-ADHterm plasmids. 3xL and 4xL fragments were PCR amplified from pAG25-3xL-DHFR F[1,2]-ADHterm and pAG25-4xL-DHFR F[1,2]-ADHterm, respectively. The DHFR F[3] fragment was amplified from pAG32-linker-DHFR F[3]-ADHterm. All PCR reactions were digested with DpnI and purified. Plasmid pAG32-linker-DHFR F[3]-ADHterm was digested with XbaI and BamHI. The fragment corresponding to the plasmid without the 2xL-DHFR F[3] region was gel-extracted. The remaining steps were performed as described above for the pAG25-3xL/4xL-DHFR F[1,2]-ADHterm plasmids, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were performed at the 3′ end of genes. 2xL/3xL/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused with the DHFR fragments (PCR primer sequences are found in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). Proper DHFR fragment fusions were confirmed for all strains by PCR and Sanger sequencing.

Estimation of protein abundance

Protein abundance was quantified by Western blot for several strains with proteins fused to the 2xL and 4xL. These proteins were selected because we could readily assess their abundance using antibodies raised against them. 20 OD600 of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gels and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading control; 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2xL/3xL/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belonged to the same complexes as the baits or because they interacted with one of them according to data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published by Benschop et al. (84) to choose 95 central and associated proteins that are members of the following complexes: the RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among protein members of these complexes have been shown to be detectable, at least partially, by DHFR PCA. In addition, there are published structures available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins initially selected are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.
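
The array design described above (blocks of four linker combinations, random block positions, double border) can be sketched in Python as follows. This is only a simplified illustration: it lays out the final 32 x 48 diploid array directly instead of modelling the separate bait and prey plates, and the function name, the arrangement of the four combinations within a block and the pair labels are hypothetical.

```python
import random

ROWS, COLS, BORDER = 32, 48, 2          # 1536 positions, two-colony-wide border
COMBOS = ["2xL-2xL", "2xL-4xL", "4xL-2xL", "4xL-4xL"]


def layout(pairs, replicates, seed=0):
    """Return {(row, col): label} for a bordered array filled with 2x2 blocks."""
    rng = random.Random(seed)
    # Top-left corner of every 2x2 block that fits inside the border
    corners = [(r, c)
               for r in range(BORDER, ROWS - BORDER, 2)
               for c in range(BORDER, COLS - BORDER, 2)]
    blocks = [pair for pair in pairs for _ in range(replicates)]
    if len(blocks) > len(corners):
        raise ValueError("too many blocks for a single plate")
    rng.shuffle(corners)                # random block positions
    plate = {}
    for r in range(ROWS):
        for c in range(COLS):
            on_border = (r < BORDER or r >= ROWS - BORDER or
                         c < BORDER or c >= COLS - BORDER)
            if on_border:
                plate[(r, c)] = "border"
    for pair, (r, c) in zip(blocks, corners):
        for k, combo in enumerate(COMBOS):
            # The four linker combinations occupy adjacent positions
            plate[(r + k // 2, c + k % 2)] = f"{pair}:{combo}"
    return plate
```

With these dimensions the border uses 304 positions and leaves room for 308 blocks per plate, so the four linker combinations of one pair always grow side by side under comparable local conditions.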

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we retested by standard DHFR PCA 25 PPIs for which the difference in signal between linkers was increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments is shown in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. 5-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files, and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function "Analyze particle" in ImageJ64. We excluded particles touching the edge of the selection and those that had an area smaller than 20 pixels and a circularity below 0.5, retaining the particle closest to the center. We considered the particle as being a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
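
A minimal Python sketch of this transformation and filter is given below. The record layout (dictionaries with `mtx` and `diploid` keys) and the function name are assumptions made for the example; the threshold of 16 is taken from the text, in whatever size unit the image analysis produced.

```python
import math


def transform(colonies, min_diploid_size=16):
    """Drop colonies that were too small on the diploid plate, then log2(x + 1).

    colonies: list of dicts with 'mtx' (raw intensity on MTX, day 4) and
    'diploid' (colony size on the diploid selection plate). Hypothetical layout.
    """
    kept = [c for c in colonies if c["diploid"] >= min_diploid_size]
    # Adding 1 keeps colonies with zero intensity defined (log2(1) = 0)
    return [math.log2(c["mtx"] + 1) for c in kept]
```

Filtering on the diploid plate removes positions where a failed mating, not an absent interaction, would explain the lack of growth on MTX.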

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained, and the median colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected interactions. These Zs were used to compare the same interaction across linker size combinations. We considered changes significant when Zs differed by more than 2.
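
To make the z-score step concrete, the sketch below fits a two-component normal mixture with a small hand-written EM loop and standardizes each score against the background component. It is a stand-in written for illustration, not the mixtools routine used for the analyses; the initialization, the fixed number of iterations and the assumption that the background is the component with the lower mean are choices made for this example.

```python
import math
import statistics


def fit_two_normals(xs, iters=200):
    """Minimal EM for a 1-D mixture of two normals.

    Returns [(weight, mean, sd), (weight, mean, sd)] sorted by mean.
    """
    xs = sorted(xs)
    half = len(xs) // 2
    mu = [statistics.mean(xs[:half]), statistics.mean(xs[half:])]
    sd = [statistics.pstdev(xs) or 1.0] * 2
    w = [0.5, 0.5]
    for _ in range(iters):
        # E-step: probability that each score belongs to component 0
        resp = []
        for x in xs:
            p = [w[k] / sd[k] * math.exp(-0.5 * ((x - mu[k]) / sd[k]) ** 2)
                 for k in (0, 1)]
            resp.append(p[0] / ((p[0] + p[1]) or 1e-300))
        # M-step: re-estimate weight, mean and sd of each component
        for k in (0, 1):
            r = resp if k == 0 else [1 - q for q in resp]
            n = max(sum(r), 1e-12)
            mu[k] = sum(ri * x for ri, x in zip(r, xs)) / n
            var = sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, xs)) / n
            sd[k] = max(math.sqrt(var), 1e-6)
            w[k] = n / len(xs)
    return sorted(zip(w, mu, sd), key=lambda comp: comp[1])


def z_scores(scores):
    """Zs = (Is - b) / sdb, with b and sdb taken from the lower-mean component."""
    _, b, sdb = fit_two_normals(scores)[0]
    return [(s - b) / sdb for s in scores]
```

Scores with a resulting Zs above 2.5 would then be called detected, as in the text. Standardizing against the fitted background, and not against all scores, prevents true interactions from inflating the estimated noise.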

For the intra-complexes experiment, extreme outliers on the MTX selection plates, that is, values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained, and the median colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
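
The outlier rule and the replicate threshold can be sketched as follows. Applying the quartile fences to each group of replicates and using the default quartile estimator of Python's `statistics.quantiles` are assumptions made for this example; the function names are hypothetical.

```python
import statistics


def drop_extreme_outliers(values, k=3.0):
    """Keep values inside [Q1 - k*(Q3 - Q1), Q3 + k*(Q3 - Q1)]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if low <= v <= high]


def interaction_score(replicates, min_replicates=4):
    """Median of the retained replicates, or None when too few remain."""
    kept = drop_extreme_outliers(replicates)
    if len(kept) < min_replicates:
        return None
    return statistics.median(kept)
```

A factor of 3 times the interquartile range is deliberately permissive: it removes only aberrant colonies (for example contamination or pinning failures) while leaving ordinary replicate variation untouched.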

At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1 with t-test FDR p-value < 0.05) (Table S1C).
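
The classification logic can be summarized with the Python sketch below. The text does not name the FDR procedure, so the Benjamini-Hochberg adjustment used here is an assumption, as are the function names. Pairs with a significant p-value but an absolute difference of 1 or less are not explicitly covered by the definitions above; this sketch labels them as unchanged.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):        # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(detected_2x2, detected_longer, diffs, fdr_p):
    """Label one protein pair.

    diffs: Is(longer combination) - Is(2xL-2xL) for each longer combination.
    fdr_p: matching FDR-corrected p-values of the paired t-tests.
    """
    if not detected_2x2:
        return "new" if detected_longer else "not detected"
    changed = any(abs(d) > 1 and p < 0.05 for d, p in zip(diffs, fdr_p))
    return "quantitative change" if changed else "unchanged"
```

Requiring both an effect size (absolute difference above 1 on the log2 scale) and a corrected p-value guards against calling small but highly reproducible shifts a change.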

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core, while the inner rings of the core came from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.

The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes within a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
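
The surface-path distance can be illustrated with the self-contained Python sketch below. It re-implements Dijkstra's algorithm with the standard library so that the example runs without dependencies; with NetworkX, the same quantity would be obtained by building a weighted graph and calling `networkx.dijkstra_path_length`. The function name, node labels and coordinates are hypothetical.

```python
import heapq
import math


def surface_path_length(coords, source, target, cutoff):
    """Length of the weighted shortest path between two surface nodes.

    coords: {node_id: (x, y, z)} for surface C-alpha atoms, in angstroms.
    Two nodes are connected when they lie within `cutoff` of each other,
    and the edge weight is their Euclidean distance.
    """
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == target:
            return d
        if d > best.get(u, math.inf):
            continue                      # stale queue entry
        for v, xyz in coords.items():
            if v == u:
                continue
            w = math.dist(coords[u], xyz)
            if w <= cutoff and d + w < best.get(v, math.inf):
                best[v] = d + w
                heapq.heappush(queue, (d + w, v))
    return math.inf                       # target not reachable


# Toy example: a-c are 14.1 angstroms apart, so a 12 angstrom cutoff forces the
# path to go through b
toy = {"a": (0, 0, 0), "b": (10, 0, 0), "c": (10, 10, 0)}
print(surface_path_length(toy, "a", "c", cutoff=12))  # 20.0
```

Restricting nodes to surface residues makes the path wrap around the complex instead of cutting through its interior, which is closer to the route a flexible linker would have to follow than a straight-line distance.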

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, the effect is minor and not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, each with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some of these interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in size. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtering (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counting as a single PPI).

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e., X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). The PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.
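
The counts reported in this paragraph can be cross-checked with a few lines of Python. The numbers are those given in the text; the variable names are only for illustration.

```python
# (all pairs, unique pairs) per category, as reported in the text
counts = {
    "unchanged": (135, 114),
    "quantitatively changed": (19, 18),
    "new": (74, 65),
}

total_all = sum(a for a, _ in counts.values())
total_unique = sum(u for _, u in counts.values())
affected_all = total_all - counts["unchanged"][0]
affected_unique = total_unique - counts["unchanged"][1]
unchanged_fraction = counts["unchanged"][0] / total_all

print(total_all, total_unique, affected_all, affected_unique,
      round(unchanged_fraction, 3))  # 228 197 93 83 0.592
```

The three categories therefore add up to the 228 (197 unique) interacting pairs retained after filtering, and the unchanged fraction of about 59% matches the "almost 60%" quoted above.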

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often further improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe a or lobe b (Fig. 1D, right panel). Together, these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, the PCA signal obtained with longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases the signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that longer linkers enhance the ability to detect interactions, especially for proteins that are more distant in space.
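
The correlation between pairwise distance and z-score reported for the proteasome is a Spearman rank correlation (see the legend of Fig. 2B). As a hedged illustration of how such a coefficient is computed, the sketch below implements it with the standard library; the function names are hypothetical and no data from this study are used.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

A rank-based coefficient is a natural choice here because the relation between distance and PCA signal is expected to be monotonic but not necessarily linear.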

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institute of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canadian Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complexes PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and for those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
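As an illustration of the statistic reported in panel (B), the following Python sketch computes a Spearman rank correlation between C-terminal distances and PPI z-scores. The values are invented toy data and the helper names are hypothetical; the sketch only shows how a negative coefficient arises when the signal decreases with distance, and it does not reproduce the values of the figure.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (assumed): distances between C-termini and PPI z-scores
distances = [10.0, 25.0, 40.0, 60.0, 90.0]
z_scores = [8.1, 6.0, 6.5, 3.2, 2.6]

rho = spearman(distances, z_scores)
print(f"Spearman r = {rho:.2f}")
```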


Table S1A. Description of the strains constructed and used for this study.

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment.

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complexes experiment.

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study.

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling.

Table S2A is too lengthy to be included in this document but can be obtained upon request.


Table S2B. Identity between each RNApol structure and the experimental sequences

Reference  Yeast proteins  Complex  Identity (%)
4C2M chain 1  Rpc10  RNApol I  100
4C2M chain 2  Rpa34  RNApol I  92.4
4C2M chain 3  Rpa49  RNApol I  94.4
4C2M chain 4  Rpa43  RNApol I  100
4C2M chain 5  Rpa190  RNApol I  89.7
4C2M chain 6  Rpc40  RNApol I  100
4C2M chain 7  Rpa135  RNApol I  97.2
4C2M chain 8  Rpb5  RNApol I  100
4C2M chain 9  Rpa14  RNApol I  59.6
4C2M chain 10  Rpa43  RNApol I  81.4
4C2M chain 11  Rpo26  RNApol I  100
4C2M chain 12  Rpa12  RNApol I  100
4C2M chain 13  Rpb8  RNApol I  88.2
4C2M chain 14  Rpc19  RNApol I  100
4C2M chain 15  Rpb10  RNApol I  100
4C2M chain 16  Rpa49  RNApol I  100
4C2M chain 17  Rpc10  RNApol I  100
4C2M chain 18  Rpa43  RNApol I  100
4C2M chain 19  Rpa34  RNApol I  92.4
4C2M chain 20  Rpa135  RNApol I  96.2
4C2M chain 21  Rpa190  RNApol I  88.5
4C2M chain 22  Rpa14  RNApol I  55.1
4C2M chain 23  Rpc40  RNApol I  100
4C2M chain 24  Rpo26  RNApol I  100
4C2M chain 25  Rpb5  RNApol I  100
4C2M chain 26  Rpb8  RNApol I  88.2
4C2M chain 27  Rpa43  RNApol I  80.2
4C2M chain 28  Rpb10  RNApol I  100
4C2M chain 29  Rpa12  RNApol I  96
4C2M chain 30  Rpc19  RNApol I  100
4C3I chain A  Rpa190  RNApol I  89.2
4C3I chain C  Rpc40  RNApol I  99.3
4C3I chain B  Rpa135  RNApol I  98.2
4C3I chain E  Rpb5  RNApol I  100
4C3I chain D  Rpa14  RNApol I  55.1
4C3I chain G  Rpa43  RNApol I  78.3
4C3I chain F  Rpo26  RNApol I  100
4C3I chain I  Rpa12  RNApol I  100
4C3I chain H  Rpb8  RNApol I  84.7
4C3I chain K  Rpc19  RNApol I  100
4C3I chain J  Rpb10  RNApol I  100
4C3I chain M  Rpa49  RNApol I  97.2
4C3I chain L  Rpc10  RNApol I  100
4C3I chain N  Rpa34  RNApol I  88
4V1N chain A  Rpo21  RNApol II  97.9
4V1N chain C  Rpb3  RNApol II  100
4V1N chain B  Rpb2  RNApol II  93.6
4V1N chain E  Rpb5  RNApol II  100
4V1N chain D  Rpb4  RNApol II  80.8
4V1N chain G  Rpb7  RNApol II  100
4V1N chain F  Rpo26  RNApol II  100
4V1N chain I  Rpb9  RNApol II  100
4V1N chain H  Rpb8  RNApol II  91
4V1N chain K  Rpb11  RNApol II  100
4V1N chain J  Rpb10  RNApol II  100
4V1N chain L  Rpc10  RNApol II  100
4V1N chain R  Tfg2  RNApol II  60.3
5FJA chain A  Rpo31  RNApol III  96.2
5FJA chain C  Rpc40  RNApol III  100
5FJA chain B  Ret1  RNApol III  100
5FJA chain E  Rpb5  RNApol III  100
5FJA chain D  Rpc17  RNApol III  73.9
5FJA chain G  Rpc25  RNApol III  85.8
5FJA chain F  Rpo26  RNApol III  100
5FJA chain I  Rpc11  RNApol III  82.7
5FJA chain H  Rpb8  RNApol III  94.5
5FJA chain K  Rpc19  RNApol III  100
5FJA chain J  Rpb10  RNApol III  100
5FJA chain M  Rpc37  RNApol III  84.9
5FJA chain L  Rpc10  RNApol III  100
5FJA chain O  Rpc82  RNApol III  84.3
5FJA chain N  Rpc53  RNApol III  73.8
5FJA chain Q  Rpc31  RNApol III  100
5FJA chain P  Rpc34  RNApol III  57.2


Table S2C. Identity between the proteasome structures and the experimental sequences

Reference  Yeast proteins  Complex  Identity (%)
5CZ4-centered chain A  Pre8  Proteasome  100
5CZ4-centered chain AA  Pre4  Proteasome  100
5CZ4-centered chain B  Pre9  Proteasome  100
5CZ4-centered chain BA  Pre3  Proteasome  100
5CZ4-centered chain C  Pre6  Proteasome  100
5CZ4-centered chain D  Pup2  Proteasome  97.1
5CZ4-centered chain E  Pre5  Proteasome  100
5CZ4-centered chain F  Pre10  Proteasome  100
5CZ4-centered chain G  Scl1  Proteasome  100
5CZ4-centered chain H  Pup1  Proteasome  100
5CZ4-centered chain I  Pup3  Proteasome  100
5CZ4-centered chain J  Pre1  Proteasome  100
5CZ4-centered chain K  Pre2  Proteasome  100
5CZ4-centered chain L  Pre7  Proteasome  100
5CZ4-centered chain M  Pre4  Proteasome  100
5CZ4-centered chain N  Pre3  Proteasome  100
5CZ4-centered chain O  Pre8  Proteasome  100
5CZ4-centered chain P  Pre9  Proteasome  100
5CZ4-centered chain Q  Pre6  Proteasome  100
5CZ4-centered chain R  Pup2  Proteasome  97.1
5CZ4-centered chain S  Pre5  Proteasome  100
5CZ4-centered chain T  Pre10  Proteasome  100
5CZ4-centered chain U  Scl1  Proteasome  100
5CZ4-centered chain V  Pup1  Proteasome  100
5CZ4-centered chain W  Pup3  Proteasome  100
5CZ4-centered chain X  Pre1  Proteasome  100
5CZ4-centered chain Y  Pre2  Proteasome  100
5CZ4-centered chain Z  Pre7  Proteasome  100
5A5B-centered chain A  Pre3  Proteasome  100
5A5B-centered chain AA  Rpn7  Proteasome  100
5A5B-centered chain B  Pup1  Proteasome  100
5A5B-centered chain BA  Rpn3  Proteasome  100
5A5B-centered chain C  Pup3  Proteasome  100
5A5B-centered chain CA  Rpn12  Proteasome  100
5A5B-centered chain D  Pre1  Proteasome  100
5A5B-centered chain DA  Rpn8  Proteasome  82.9
5A5B-centered chain E  Pre2  Proteasome  99.5
5A5B-centered chain EA  Rpn11  Proteasome  89.5
5A5B-centered chain F  Pre7  Proteasome  100
5A5B-centered chain FA  Rpn10  Proteasome  100
5A5B-centered chain G  Pre4  Proteasome  100
5A5B-centered chain GA  Rpn13  Proteasome  100
5A5B-centered chain HA  Sem1  Proteasome  100
5A5B-centered chain IA  Rpn1  Proteasome  85.9
5A5B-centered chain J  Scl1  Proteasome  100
5A5B-centered chain K  Pre8  Proteasome  100
5A5B-centered chain L  Pre9  Proteasome  100
5A5B-centered chain M  Pre6  Proteasome  100
5A5B-centered chain N  Pup2  Proteasome  100
5A5B-centered chain O  Pre5  Proteasome  100
5A5B-centered chain P  Pre10  Proteasome  100
5A5B-centered chain Q  Rpt1  Proteasome  88
5A5B-centered chain R  Rpt2  Proteasome  100
5A5B-centered chain S  Rpt6  Proteasome  100
5A5B-centered chain T  Rpt3  Proteasome  100
5A5B-centered chain U  Rpt4  Proteasome  100
5A5B-centered chain V  Rpt5  Proteasome  93.1
5A5B-centered chain W  Rpn2  Proteasome  90.9
5A5B-centered chain X  Rpn9  Proteasome  100
5A5B-centered chain Y  Rpn5  Proteasome  100
5A5B-centered chain Z  Rpn6  Proteasome  100
Constructed proteasome chain 1  Pup1  Proteasome  100
Constructed proteasome chain 10  Pre8  Proteasome  100
Constructed proteasome chain 11  Pre9  Proteasome  100
Constructed proteasome chain 12  Pre6  Proteasome  100
Constructed proteasome chain 13  Pup2  Proteasome  100
Constructed proteasome chain 14  Pre5  Proteasome  100
Constructed proteasome chain 15  Pre10  Proteasome  100
Constructed proteasome chain 16  Rpt1  Proteasome  88
Constructed proteasome chain 17  Rpt2  Proteasome  100
Constructed proteasome chain 18  Rpt6  Proteasome  100
Constructed proteasome chain 19  Rpt3  Proteasome  100
Constructed proteasome chain 2  Pup3  Proteasome  100
Constructed proteasome chain 20  Rpt4  Proteasome  100
Constructed proteasome chain 21  Rpt5  Proteasome  93.1
Constructed proteasome chain 22  Rpn2  Proteasome  90.9
Constructed proteasome chain 23  Rpn9  Proteasome  100
Constructed proteasome chain 24  Rpn5  Proteasome  100
Constructed proteasome chain 25  Rpn6  Proteasome  100
Constructed proteasome chain 26  Rpn7  Proteasome  100
Constructed proteasome chain 27  Rpn3  Proteasome  100
Constructed proteasome chain 28  Rpn12  Proteasome  100
Constructed proteasome chain 29  Rpn8  Proteasome  82.9
Constructed proteasome chain 3  Pre1  Proteasome  100
Constructed proteasome chain 30  Rpn11  Proteasome  89.5
Constructed proteasome chain 31  Rpn10  Proteasome  100
Constructed proteasome chain 32  Rpn13  Proteasome  100
Constructed proteasome chain 33  Sem1  Proteasome  100
Constructed proteasome chain 34  Rpn1  Proteasome  85.9
Constructed proteasome chain 35  Pup1  Proteasome  100
Constructed proteasome chain 36  Pup3  Proteasome  100
Constructed proteasome chain 37  Pre1  Proteasome  100
Constructed proteasome chain 38  Pre2  Proteasome  100
Constructed proteasome chain 39  Pre7  Proteasome  100
Constructed proteasome chain 4  Pre2  Proteasome  100
Constructed proteasome chain 40  Pre4  Proteasome  100
Constructed proteasome chain 41  Pre3  Proteasome  100
Constructed proteasome chain 42  Pre4  Proteasome  100
Constructed proteasome chain 45  Scl1  Proteasome  100
Constructed proteasome chain 46  Pre8  Proteasome  100
Constructed proteasome chain 47  Pre9  Proteasome  100
Constructed proteasome chain 48  Pre6  Proteasome  100
Constructed proteasome chain 49  Pup2  Proteasome  100
Constructed proteasome chain 5  Pre7  Proteasome  100
Constructed proteasome chain 50  Pre5  Proteasome  100
Constructed proteasome chain 51  Pre10  Proteasome  100
Constructed proteasome chain 52  Rpt1  Proteasome  88
Constructed proteasome chain 53  Rpt2  Proteasome  100
Constructed proteasome chain 54  Rpt6  Proteasome  100
Constructed proteasome chain 55  Rpt3  Proteasome  100
Constructed proteasome chain 56  Rpt4  Proteasome  100
Constructed proteasome chain 57  Rpt5  Proteasome  93.1
Constructed proteasome chain 58  Rpn2  Proteasome  90.9
Constructed proteasome chain 59  Rpn9  Proteasome  100
Constructed proteasome chain 6  Pre3  Proteasome  100
Constructed proteasome chain 60  Rpn5  Proteasome  100
Constructed proteasome chain 61  Rpn6  Proteasome  100
Constructed proteasome chain 62  Rpn7  Proteasome  100
Constructed proteasome chain 63  Rpn3  Proteasome  100
Constructed proteasome chain 64  Rpn12  Proteasome  100
Constructed proteasome chain 65  Rpn8  Proteasome  82.9
Constructed proteasome chain 66  Rpn11  Proteasome  89.5
Constructed proteasome chain 67  Rpn10  Proteasome  100
Constructed proteasome chain 68  Rpn13  Proteasome  100
Constructed proteasome chain 69  Sem1  Proteasome  100
Constructed proteasome chain 70  Rpn1  Proteasome  85.9
Constructed proteasome chain 9  Scl1  Proteasome  100


Table S2D. Number of missing residues in the C-termini of studied proteins in RNApol I, II and III and proteasome structures

Yeast proteins  Complex  Reference  Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complexes (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results for the intra-complexes experiment. Examples for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates with colony size in standard PCA (left).
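The thresholding described in panel (B) can be sketched in a few lines of Python. How the z-scores were computed is not described in this caption, so the sketch assumes a plain standardization of colony sizes against a background distribution; the function name, the background values and the pair names are hypothetical.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # threshold quoted in the caption


def z_scores(colony_sizes, background):
    """Standardize colony sizes against a background distribution (assumed)."""
    mu = mean(background)
    sigma = stdev(background)
    return {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}


# Toy data (assumed): background colony sizes and two tested pairs
background = [100, 110, 90, 105, 95]
colony_sizes = {"A-B": 180, "C-D": 104}

scores = z_scores(colony_sizes, background)
positives = [pair for pair, z in scores.items() if z > Z_THRESHOLD]
print(positives)
```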


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
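A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm. The sketch below is a generic illustration under stated assumptions: points stand in for surface residues, two points are connected when they lie within a chosen cutoff, and edges are weighted by Euclidean distance. The coordinates, the cutoff and the function name are invented; the actual graph construction used for the proteasome is not specified in this caption.

```python
import heapq
import math


def shortest_path_length(coords, start, goal, cutoff):
    """Dijkstra over points linked when closer than `cutoff` (assumed rule).

    Returns the summed Euclidean length of the shortest path, or infinity
    when the goal cannot be reached.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        dist_u, u = heapq.heappop(heap)
        if u == goal:
            return dist_u
        if dist_u > best.get(u, math.inf):
            continue  # stale queue entry
        for v, xyz in coords.items():
            if v == u:
                continue
            step = math.dist(coords[u], xyz)
            if step <= cutoff and dist_u + step < best.get(v, math.inf):
                best[v] = dist_u + step
                heapq.heappush(heap, (best[v], v))
    return math.inf


# Toy coordinates (assumed): two C-terminal residues and two surface residues
coords = {
    "Cter_A": (0.0, 0.0, 0.0),
    "surf_1": (3.0, 4.0, 0.0),
    "surf_2": (6.0, 8.0, 0.0),
    "Cter_B": (6.0, 8.0, 5.0),
}

length = shortest_path_length(coords, "Cter_A", "Cter_B", cutoff=6.0)
print(f"path length: {length:.1f}")
```

Because the direct line between the two termini exceeds the cutoff in this toy example, the path is forced along the intermediate points, which mimics a route that follows the surface of a complex rather than crossing it.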


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close to each other in space, without these associations necessarily being physical interactions. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to check whether increasing linker length changed the interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, validation of the method could be completed by comparing the results obtained with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and allowed better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all unique interactions detected, more than 30% were new. One could therefore expect to obtain practically as many new interactions if this variation of PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths used. For example, it could be lower if a single combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect the majority of new PPIs and of PPIs whose signal increases. The rare cases in which the signal decreased as linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when central proteins are involved. Finally, the associations detected reflect well the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results obtained, it is possible to confirm that increasing linker length does allow the detection of associations between proteins that are more distant in space.

The modification made to DHFR PCA represents a valuable advance in the study of protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2] fragment, it is possible to increase the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths, which would provide a validation and a point of comparison and would help detect problems that may have arisen during the construction of the proteins. For example, it becomes easier to spot a faulty recombination event or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. However, adding this control clearly makes the experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA provides an additional hybrid method that remains relatively simple. It does not require any particular infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously with each other unless they are very close together, which reduces false positives. DHFR PCA can be performed either on solid medium or in liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages compared with other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would provide several resolutions of the PPI network. It would first be necessary to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. It would also be necessary to determine the optimal increment for maximizing new information, taking into account the additional complexity introduced with each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas the resolution should be coarser for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the preservation of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution made possible by lengthening the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


corresponding to the plasmid without the 2xL-DHFR F[3] region was extracted from the gel. The remaining steps were performed as described above for pAG25-3x/4xL-DHFR F[1,2]-ADHterm, with an insert (linker):insert (DHFR F[3]):vector ratio of 4:4:1.

Strain construction

Strains were constructed in BY4741 and BY4742 for the DHFR F[1,2] and DHFR F[3] fusions, respectively (Table S1A). All fusions were made at the 3′ end of genes. The 2x/3x/4xL-DHFR F[1,2]/F[3] fragments, along with the NAT (for DHFR F[1,2]) or HPH (for DHFR F[3]) resistance modules (conferring resistance to clonNAT and HygB, respectively), were amplified by PCR from their respective plasmids with oligonucleotides specific to the gene to be fused to the DHFR fragments (PCR primer sequences are listed in Table S1D). BY4741 and BY4742 competent cells were transformed with the amplified modules following standard procedures, and selection was performed on YPD+clonNAT (DHFR F[1,2]-tagged strains) or YPD+HygB (DHFR F[3]-tagged strains). PCR and Sanger sequencing confirmed proper DHFR fragment fusion in all strains.

Estimation of protein abundance

Protein abundance was quantified by Western blot for several strains in which proteins were fused to the 2xL and 4xL. These proteins were selected because their abundance could readily be assessed with antibodies raised against them. 20 OD600 of exponentially growing cells were resuspended in 200 µL of water containing peptidase inhibitors (1 mM PMSF, 0.7 µg/mL Pepstatin A, 0.5 µg/mL Leupeptin and 2 µg/mL Aprotinin). Glass beads of 425-600 µm (Sigma) were added (0.1 g) and cells were vortexed using a TurboMix attachment (Scientific Industries Inc.) for 5 min. After addition of 1% SDS, samples were boiled and supernatants were transferred to a new tube. Protein extracts equivalent to 0.1 OD600 of cells were separated on an 8% (Vps35p) or 10% (Vps5p, Vps17p, Pep8p, Vps29p and Bcy1p) SDS-PAGE gel and transferred to a nitrocellulose membrane using a TE 77 PWR semi-dry device (Amersham). After saturation in Odyssey® Blocking Buffer (PBS) overnight at 4 °C, membranes were probed with Rabbit anti-Vps5p, anti-Vps17p, anti-Vps26p, anti-Vps29p or anti-Vps35p (kindly provided by M. N. J. Seaman) (1:2000), Goat anti-Bcy1p (1:1000) or Mouse anti-Actin (as a loading control, 1:5000) in Blocking Buffer + 0.2% Tween 20 for 2 hours at room temperature. After three 10 min washes in PBS + 0.2% Tween 20, membranes were probed with the secondary antibodies IRDye® 680RD Goat anti-Rabbit IgG (1:10,000), IRDye® 680RD Donkey anti-Goat IgG (1:5000) or IRDye® 800CW Goat anti-Mouse IgG (1:10,000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three 10 min washes in PBS + 0.2% Tween 20 were performed and the signal on membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantification was done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins fused to 2x/3x/4xL-DHFR F[1,2] that are part of seven complexes. Prey proteins fused to the 2xL-DHFR F[3] (495 strains) were selected because they belong to the same complexes as the baits or because they interact with one of them according to data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published in (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among their members have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, representing 89.5% of the proteins (85 of the 95 proteins initially selected are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned so that, within a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed into 384-format arrays and finally into 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different 1536-format prey plates were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we used standard DHFR PCA to confirm 25 PPIs for which the difference in signal was increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments is shown in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and applied a manual threshold across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles touching the edge of the selection and those with an area below 20 pixels and a circularity below 0.5, and used the particle closest to the center. We considered the particle to be a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
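The particle filter described above can be illustrated with a minimal sketch. It assumes the usual ImageJ definition of circularity (4π × area / perimeter²) and reads the area and circularity thresholds as two independent exclusion criteria, which is one interpretation of the text. The dictionary layout and the function names are hypothetical and are not taken from the custom ImageJ script.

```python
import math

def circularity(area, perimeter):
    """ImageJ-style circularity: 4*pi*area / perimeter**2 (1.0 for a perfect circle)."""
    if perimeter <= 0:
        return 0.0
    return 4.0 * math.pi * area / perimeter ** 2

def pick_colony(particles, center, min_area=20, min_circ=0.5):
    """Return the accepted particle closest to the expected colony center, or None.

    `particles` is a list of dicts with keys 'area', 'perimeter', 'x', 'y'
    (a hypothetical layout, not ImageJ's own output format).
    """
    kept = [p for p in particles
            if p["area"] >= min_area
            and circularity(p["area"], p["perimeter"]) >= min_circ]
    if not kept:
        return None
    # Among accepted particles, keep the one nearest to the expected position.
    return min(kept, key=lambda p: math.hypot(p["x"] - center[0],
                                              p["y"] - center[1]))
```

Background subtraction and the mid-distance test on the center of mass are left out of this sketch.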

Colony intensity values from day 4 of growth on the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
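As a small illustration of this step, the sketch below applies the log2(x + 1) transformation and drops colonies that were too small on the diploid selection plate. The dictionaries keyed by colony position and the function name are assumptions made for the example.

```python
import math

def log2_scores(mtx_intensity, diploid_size, min_diploid_size=16):
    """Return log2(intensity + 1) for colonies that grew on the diploid plate.

    Both arguments are dicts keyed by colony position (hypothetical layout).
    Colonies whose diploid-selection size is below `min_diploid_size`,
    or that are missing from `diploid_size`, are dropped.
    """
    return {pos: math.log2(value + 1)
            for pos, value in mtx_intensity.items()
            if diploid_size.get(pos, 0) >= min_diploid_size}
```

Adding 1 before the transformation keeps colonies with zero intensity in the data set with a score of 0 instead of an undefined value.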

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained and the median colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected. These Zs were used to compare the same interaction across linker size combinations. We considered a change significant when Zs differed by more than 2.
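The conversion of interaction scores into z-scores can be sketched as follows. The analysis itself relied on the R package mixtools. The code below is only a pure-Python stand-in that fits a two-component normal mixture with a basic EM loop, takes the component with the lower mean as the background, and applies Zs = (Is − b)/sdb. The initialisation, the number of iterations and the function names are assumptions for illustration.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def fit_two_normals(values, n_iter=100):
    """Fit a two-component 1-D normal mixture by EM.

    Returns [(weight, mean, sd), (weight, mean, sd)] sorted by mean,
    so the first component is treated as the background.
    """
    data = list(values)
    # Crude initialisation: split the scores at the midpoint of their range.
    cut = (min(data) + max(data)) / 2
    groups = [[x for x in data if x <= cut], [x for x in data if x > cut]]
    if not groups[1]:
        raise ValueError("all scores are identical; cannot fit two components")
    params = []
    for g in groups:
        mean = sum(g) / len(g)
        sd = math.sqrt(sum((x - mean) ** 2 for x in g) / len(g)) or 1.0
        params.append((len(g) / len(data), mean, sd))
    for _ in range(n_iter):
        # E-step: probability that each score belongs to component 0.
        resp0 = []
        for x in data:
            p0 = params[0][0] * normal_pdf(x, params[0][1], params[0][2])
            p1 = params[1][0] * normal_pdf(x, params[1][1], params[1][2])
            resp0.append(p0 / (p0 + p1) if p0 + p1 > 0 else 0.5)
        # M-step: update weight, mean and standard deviation of each component.
        new_params = []
        for resp in (resp0, [1.0 - r for r in resp0]):
            n_k = max(sum(resp), 1e-12)
            mean = sum(r * x for r, x in zip(resp, data)) / n_k
            var = sum(r * (x - mean) ** 2 for r, x in zip(resp, data)) / n_k
            new_params.append((n_k / len(data), mean, max(math.sqrt(var), 1e-6)))
        params = new_params
    return sorted(params, key=lambda p: p[1])

def z_scores(scores, threshold=2.5):
    """Return (z-scores, detection flags) relative to the background component."""
    scores = list(scores)
    (_, b, sd_b), _ = fit_two_normals(scores)
    zs = [(s - b) / sd_b for s in scores]
    return zs, [z > threshold for z in zs]
```

Using the background component rather than the whole distribution as the reference keeps true interactions from inflating the estimate of the noise.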

For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 are the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained and the median colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. Of the 236 protein pairs with a detected interaction for at least one linker combination, some were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) gave incoherent results for all tested interactions, leaving a total of 228 (197 unique) pairs of interacting proteins.
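The outlier rule and the replicate filter can be sketched as below. The sketch assumes that the fences are applied to the set of values passed in and uses the default quartile method of Python's statistics module. The text does not state which quartile definition was used, so values near the fences could be classified differently by another method. Function names are hypothetical.

```python
import statistics

def drop_extreme_outliers(values, k=3.0):
    """Remove values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

def interaction_score(values, min_replicates=4):
    """Median of the retained replicates, or None if too few remain."""
    kept = drop_extreme_outliers(values)
    if len(kept) < min_replicates:
        return None
    return statistics.median(kept)
```

With k = 3 only values far from the bulk of the replicates are discarded, which is more conservative than the common 1.5 × IQR rule.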

At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that all combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL) could be tested for a specific interaction within a block of four adjacent strains, the Is for the different linker size combinations could be compared directly. The difference from the reference 2xL-2xL interaction was calculated for each linker combination: 2xL-4xL, 4xL-2xL and 4xL-4xL. A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, where the interaction was detected with the reference linker size (2xL-2xL) and showed a significant change for at least one longer linker combination (difference greater than 1 or smaller than −1, with t-test FDR p-value < 0.05) (Table S1C).
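The classification into new, unchanged and quantitatively changed interactions can be illustrated with the sketch below. The text does not name the FDR procedure, so the Benjamini-Hochberg adjustment shown here is an assumption. The p-values are taken as given (the paired t-test itself is not reimplemented), and pairs with a significant p-value but an absolute difference of 1 or less are labelled unchanged, which is also an assumption. The function handles a single longer linker combination at a time.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

def classify(detected_ref, detected_long, diff, q_value):
    """Label one protein pair for one longer linker combination.

    detected_ref:  detected with the reference 2xL-2xL combination
    detected_long: detected with the longer linker combination
    diff:          Is(longer combination) - Is(2xL-2xL)
    q_value:       FDR-corrected p-value of the paired t-test
    """
    if not detected_ref:
        return "new" if detected_long else "not detected"
    if abs(diff) > 1 and q_value < 0.05:
        return "quantitative change"
    return "unchanged"
```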

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two copies of 5A5B onto 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core, while the inner rings of the core were taken from 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.

The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is incomplete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of a protein were present within a complex, the mean of the minimal possible distances was used for the analyses.
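The shortest-path distance can be illustrated with a minimal sketch. The analysis used the Dijkstra implementation of NetworkX. The code below is a standard-library stand-in (heapq) that builds the same kind of graph, with nodes at Cα coordinates, edges between nodes closer than a cutoff, and edge weights equal to the Euclidean distance. Node names, coordinates and function names are hypothetical.

```python
import heapq
import math

def build_graph(coords, cutoff):
    """Connect every pair of nodes no farther apart than `cutoff`.

    `coords` maps a node name to an (x, y, z) tuple; edge weight = distance.
    """
    graph = {name: [] for name in coords}
    names = list(coords)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            new = dist + weight
            if new < best.get(neighbour, math.inf):
                best[neighbour] = new
                heapq.heappush(heap, (new, neighbour))
    return math.inf
```

Because the path is forced to travel between surface nodes, the resulting distance follows the outside of the complex instead of cutting straight through its interior, which is presumably why it was preferred over a direct Euclidean distance between C-termini.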

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, each with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal are between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
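The size of the global screen follows directly from the counts reported in the text: 15 bait proteins with three linker lengths each, queried against 495 selected preys plus 97 random control preys. The short sketch below reproduces this arithmetic. The variable names are chosen for illustration only.

```python
from itertools import product

N_BAIT_PROTEINS = 15
BAIT_LINKERS = ("2xL", "3xL", "4xL")
N_SELECTED_PREYS = 495   # same complex as a bait or reported interactor
N_RANDOM_PREYS = 97      # random cytoplasmic or nuclear controls

# One bait strain per (protein, linker) combination.
bait_strains = list(product(range(N_BAIT_PROTEINS), BAIT_LINKERS))
n_preys = N_SELECTED_PREYS + N_RANDOM_PREYS
n_tests = len(bait_strains) * n_preys

print(len(bait_strains), n_preys, n_tests)  # prints: 45 592 26640
```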

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI).
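The distinction between oriented and unique pairs can be made concrete with a short sketch: reciprocal bait-prey orientations of the same two proteins collapse into one unordered pair. The protein names below are placeholders and the function name is hypothetical.

```python
def unique_pairs(oriented_pairs):
    """Collapse (bait, prey) pairs so that X-Y and Y-X count once."""
    return {frozenset(pair) for pair in oriented_pairs}

# Placeholder names: X-Y and Y-X are reciprocal orientations of the same PPI.
detected = [("X", "Y"), ("Y", "X"), ("X", "Z")]
print(len(detected), len(unique_pairs(detected)))  # prints: 3 2
```

A frozenset is used because it ignores the order of the two proteins and can be stored in a set.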

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the conclusion that there is no overall decrease in specificity with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This effect may thus arise from steric effects rather than from destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to individual polymerases (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1 from the core subunit and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal obtained with longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distance, where positive z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer in cases where a longer linker increases the signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.
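The relationship between pairwise distance and interaction z-score described above is a rank correlation. The following is a minimal sketch of a Spearman correlation written with the Python standard library only. The distance and z-score values are invented for illustration and are not data from this study, and the function names `ranks` and `spearman` are hypothetical.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical C-terminus distances (in angstroms) and PPI z-scores.
distances = [20.0, 45.0, 60.0, 90.0, 120.0, 150.0]
z_scores = [9.1, 7.4, 7.9, 3.2, 2.8, 0.5]

rho = spearman(distances, z_scores)
print(round(rho, 3))
```

A negative value of `rho` indicates that z-scores tend to decrease as the distance between C-termini increases, which is the direction of the trend reported for the proteasome.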

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information on the subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project came from Canadian Institutes of Health Research (CIHR) grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) or a 4xL (right) linker, compared with a 2xL linker. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circles: RNApol I, II and III. Orange squares: proteasome. Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased compared with the 2xL-2xL reference interaction). Solid shapes: new PPIs (not detected with the 2xL-2xL reference linker combination but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, counting reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signals for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.
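Panel (C) counts reciprocal bait-prey orientations as a single PPI and separates new PPIs from those already seen with the reference linkers. The sketch below shows one way this bookkeeping could be done. It assumes that a PPI counts as detected when at least one orientation is detected, which the caption does not specify. The pairs and the names `unique_ppis` and `classify` are hypothetical.

```python
def unique_ppis(pairs):
    """Collapse reciprocal (bait, prey) pairs into unordered PPIs."""
    return {frozenset(pair) for pair in pairs}


def classify(reference_pairs, longer_linker_pairs):
    """Split the unique PPIs seen with longer linkers into shared and new ones."""
    reference = unique_ppis(reference_pairs)
    longer = unique_ppis(longer_linker_pairs)
    return {"shared": reference & longer, "new": longer - reference}


# Hypothetical detections written as (bait, prey); X, Y, Z and W are placeholders.
detected_2xl = [("X", "Y"), ("Y", "X"), ("X", "Z")]
detected_4xl = [("Y", "X"), ("Z", "X"), ("W", "X")]

groups = classify(detected_2xl, detected_4xl)
all_unique = unique_ppis(detected_2xl + detected_4xl)
fraction_new = len(groups["new"]) / len(all_unique)
print(sorted(sorted(p) for p in groups["new"]), round(fraction_new, 2))
```

Using `frozenset` makes the pair order irrelevant, so X-Y and Y-X map to the same key without any explicit sorting of names.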


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between the z-scores of all detected PPIs in the proteasome and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini: interactions that are not affected by longer linkers, those that increase in signal and those that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference       Yeast protein  Complex     Identity (%)
4C2M chain 1    Rpc10          RNApol I    100
4C2M chain 2    Rpa34          RNApol I    92.4
4C2M chain 3    Rpa49          RNApol I    94.4
4C2M chain 4    Rpa43          RNApol I    100
4C2M chain 5    Rpa190         RNApol I    89.7
4C2M chain 6    Rpc40          RNApol I    100
4C2M chain 7    Rpa135         RNApol I    97.2
4C2M chain 8    Rpb5           RNApol I    100
4C2M chain 9    Rpa14          RNApol I    59.6
4C2M chain 10   Rpa43          RNApol I    81.4
4C2M chain 11   Rpo26          RNApol I    100
4C2M chain 12   Rpa12          RNApol I    100
4C2M chain 13   Rpb8           RNApol I    88.2
4C2M chain 14   Rpc19          RNApol I    100
4C2M chain 15   Rpb10          RNApol I    100
4C2M chain 16   Rpa49          RNApol I    100
4C2M chain 17   Rpc10          RNApol I    100
4C2M chain 18   Rpa43          RNApol I    100
4C2M chain 19   Rpa34          RNApol I    92.4
4C2M chain 20   Rpa135         RNApol I    96.2
4C2M chain 21   Rpa190         RNApol I    88.5
4C2M chain 22   Rpa14          RNApol I    55.1
4C2M chain 23   Rpc40          RNApol I    100
4C2M chain 24   Rpo26          RNApol I    100
4C2M chain 25   Rpb5           RNApol I    100
4C2M chain 26   Rpb8           RNApol I    88.2
4C2M chain 27   Rpa43          RNApol I    80.2
4C2M chain 28   Rpb10          RNApol I    100
4C2M chain 29   Rpa12          RNApol I    96
4C2M chain 30   Rpc19          RNApol I    100
4C3I chain A    Rpa190         RNApol I    89.2
4C3I chain C    Rpc40          RNApol I    99.3
4C3I chain B    Rpa135         RNApol I    98.2
4C3I chain E    Rpb5           RNApol I    100
4C3I chain D    Rpa14          RNApol I    55.1
4C3I chain G    Rpa43          RNApol I    78.3
4C3I chain F    Rpo26          RNApol I    100
4C3I chain I    Rpa12          RNApol I    100
4C3I chain H    Rpb8           RNApol I    84.7
4C3I chain K    Rpc19          RNApol I    100
4C3I chain J    Rpb10          RNApol I    100
4C3I chain M    Rpa49          RNApol I    97.2
4C3I chain L    Rpc10          RNApol I    100
4C3I chain N    Rpa34          RNApol I    88
4V1N chain A    Rpo21          RNApol II   97.9
4V1N chain C    Rpb3           RNApol II   100
4V1N chain B    Rpb2           RNApol II   93.6
4V1N chain E    Rpb5           RNApol II   100
4V1N chain D    Rpb4           RNApol II   80.8
4V1N chain G    Rpb7           RNApol II   100
4V1N chain F    Rpo26          RNApol II   100
4V1N chain I    Rpb9           RNApol II   100
4V1N chain H    Rpb8           RNApol II   91
4V1N chain K    Rpb11          RNApol II   100
4V1N chain J    Rpb10          RNApol II   100
4V1N chain L    Rpc10          RNApol II   100
4V1N chain R    Tfg2           RNApol II   60.3
5FJA chain A    Rpo31          RNApol III  96.2
5FJA chain C    Rpc40          RNApol III  100
5FJA chain B    Ret1           RNApol III  100
5FJA chain E    Rpb5           RNApol III  100
5FJA chain D    Rpc17          RNApol III  73.9
5FJA chain G    Rpc25          RNApol III  85.8
5FJA chain F    Rpo26          RNApol III  100
5FJA chain I    Rpc11          RNApol III  82.7
5FJA chain H    Rpb8           RNApol III  94.5
5FJA chain K    Rpc19          RNApol III  100
5FJA chain J    Rpb10          RNApol III  100
5FJA chain M    Rpc37          RNApol III  84.9
5FJA chain L    Rpc10          RNApol III  100
5FJA chain O    Rpc82          RNApol III  84.3
5FJA chain N    Rpc53          RNApol III  73.8
5FJA chain Q    Rpc31          RNApol III  100
5FJA chain P    Rpc34          RNApol III  57.2

Table S2C. Identity between the proteasome structure and the experimental sequences

Reference                        Yeast protein  Complex     Identity (%)
5CZ4-centered chain A            Pre8           Proteasome  100
5CZ4-centered chain AA           Pre4           Proteasome  100
5CZ4-centered chain B            Pre9           Proteasome  100
5CZ4-centered chain BA           Pre3           Proteasome  100
5CZ4-centered chain C            Pre6           Proteasome  100
5CZ4-centered chain D            Pup2           Proteasome  97.1
5CZ4-centered chain E            Pre5           Proteasome  100
5CZ4-centered chain F            Pre10          Proteasome  100
5CZ4-centered chain G            Scl1           Proteasome  100
5CZ4-centered chain H            Pup1           Proteasome  100
5CZ4-centered chain I            Pup3           Proteasome  100
5CZ4-centered chain J            Pre1           Proteasome  100
5CZ4-centered chain K            Pre2           Proteasome  100
5CZ4-centered chain L            Pre7           Proteasome  100
5CZ4-centered chain M            Pre4           Proteasome  100
5CZ4-centered chain N            Pre3           Proteasome  100
5CZ4-centered chain O            Pre8           Proteasome  100
5CZ4-centered chain P            Pre9           Proteasome  100
5CZ4-centered chain Q            Pre6           Proteasome  100
5CZ4-centered chain R            Pup2           Proteasome  97.1
5CZ4-centered chain S            Pre5           Proteasome  100
5CZ4-centered chain T            Pre10          Proteasome  100
5CZ4-centered chain U            Scl1           Proteasome  100
5CZ4-centered chain V            Pup1           Proteasome  100
5CZ4-centered chain W            Pup3           Proteasome  100
5CZ4-centered chain X            Pre1           Proteasome  100
5CZ4-centered chain Y            Pre2           Proteasome  100
5CZ4-centered chain Z            Pre7           Proteasome  100
5A5B-centered chain A            Pre3           Proteasome  100
5A5B-centered chain AA           Rpn7           Proteasome  100
5A5B-centered chain B            Pup1           Proteasome  100
5A5B-centered chain BA           Rpn3           Proteasome  100
5A5B-centered chain C            Pup3           Proteasome  100
5A5B-centered chain CA           Rpn12          Proteasome  100
5A5B-centered chain D            Pre1           Proteasome  100
5A5B-centered chain DA           Rpn8           Proteasome  82.9
5A5B-centered chain E            Pre2           Proteasome  99.5
5A5B-centered chain EA           Rpn11          Proteasome  89.5
5A5B-centered chain F            Pre7           Proteasome  100
5A5B-centered chain FA           Rpn10          Proteasome  100
5A5B-centered chain G            Pre4           Proteasome  100
5A5B-centered chain GA           Rpn13          Proteasome  100
5A5B-centered chain HA           Sem1           Proteasome  100
5A5B-centered chain IA           Rpn1           Proteasome  85.9
5A5B-centered chain J            Scl1           Proteasome  100
5A5B-centered chain K            Pre8           Proteasome  100
5A5B-centered chain L            Pre9           Proteasome  100
5A5B-centered chain M            Pre6           Proteasome  100
5A5B-centered chain N            Pup2           Proteasome  100
5A5B-centered chain O            Pre5           Proteasome  100
5A5B-centered chain P            Pre10          Proteasome  100
5A5B-centered chain Q            Rpt1           Proteasome  88
5A5B-centered chain R            Rpt2           Proteasome  100
5A5B-centered chain S            Rpt6           Proteasome  100
5A5B-centered chain T            Rpt3           Proteasome  100
5A5B-centered chain U            Rpt4           Proteasome  100
5A5B-centered chain V            Rpt5           Proteasome  93.1
5A5B-centered chain W            Rpn2           Proteasome  90.9
5A5B-centered chain X            Rpn9           Proteasome  100
5A5B-centered chain Y            Rpn5           Proteasome  100
5A5B-centered chain Z            Rpn6           Proteasome  100
Constructed proteasome chain 1   Pup1           Proteasome  100
Constructed proteasome chain 10  Pre8           Proteasome  100
Constructed proteasome chain 11  Pre9           Proteasome  100
Constructed proteasome chain 12  Pre6           Proteasome  100
Constructed proteasome chain 13  Pup2           Proteasome  100
Constructed proteasome chain 14  Pre5           Proteasome  100
Constructed proteasome chain 15  Pre10          Proteasome  100
Constructed proteasome chain 16  Rpt1           Proteasome  88
Constructed proteasome chain 17  Rpt2           Proteasome  100
Constructed proteasome chain 18  Rpt6           Proteasome  100
Constructed proteasome chain 19  Rpt3           Proteasome  100
Constructed proteasome chain 2   Pup3           Proteasome  100
Constructed proteasome chain 20  Rpt4           Proteasome  100
Constructed proteasome chain 21  Rpt5           Proteasome  93.1
Constructed proteasome chain 22  Rpn2           Proteasome  90.9
Constructed proteasome chain 23  Rpn9           Proteasome  100
Constructed proteasome chain 24  Rpn5           Proteasome  100
Constructed proteasome chain 25  Rpn6           Proteasome  100
Constructed proteasome chain 26  Rpn7           Proteasome  100
Constructed proteasome chain 27  Rpn3           Proteasome  100
Constructed proteasome chain 28  Rpn12          Proteasome  100
Constructed proteasome chain 29  Rpn8           Proteasome  82.9
Constructed proteasome chain 3   Pre1           Proteasome  100
Constructed proteasome chain 30  Rpn11          Proteasome  89.5
Constructed proteasome chain 31  Rpn10          Proteasome  100
Constructed proteasome chain 32  Rpn13          Proteasome  100
Constructed proteasome chain 33  Sem1           Proteasome  100
Constructed proteasome chain 34  Rpn1           Proteasome  85.9
Constructed proteasome chain 35  Pup1           Proteasome  100
Constructed proteasome chain 36  Pup3           Proteasome  100
Constructed proteasome chain 37  Pre1           Proteasome  100
Constructed proteasome chain 38  Pre2           Proteasome  100
Constructed proteasome chain 39  Pre7           Proteasome  100
Constructed proteasome chain 4   Pre2           Proteasome  100
Constructed proteasome chain 40  Pre4           Proteasome  100
Constructed proteasome chain 41  Pre3           Proteasome  100
Constructed proteasome chain 42  Pre4           Proteasome  100
Constructed proteasome chain 45  Scl1           Proteasome  100
Constructed proteasome chain 46  Pre8           Proteasome  100
Constructed proteasome chain 47  Pre9           Proteasome  100
Constructed proteasome chain 48  Pre6           Proteasome  100
Constructed proteasome chain 49  Pup2           Proteasome  100
Constructed proteasome chain 5   Pre7           Proteasome  100
Constructed proteasome chain 50  Pre5           Proteasome  100
Constructed proteasome chain 51  Pre10          Proteasome  100
Constructed proteasome chain 52  Rpt1           Proteasome  88
Constructed proteasome chain 53  Rpt2           Proteasome  100
Constructed proteasome chain 54  Rpt6           Proteasome  100
Constructed proteasome chain 55  Rpt3           Proteasome  100
Constructed proteasome chain 56  Rpt4           Proteasome  100
Constructed proteasome chain 57  Rpt5           Proteasome  93.1
Constructed proteasome chain 58  Rpn2           Proteasome  90.9
Constructed proteasome chain 59  Rpn9           Proteasome  100
Constructed proteasome chain 6   Pre3           Proteasome  100
Constructed proteasome chain 60  Rpn5           Proteasome  100
Constructed proteasome chain 61  Rpn6           Proteasome  100
Constructed proteasome chain 62  Rpn7           Proteasome  100
Constructed proteasome chain 63  Rpn3           Proteasome  100
Constructed proteasome chain 64  Rpn12          Proteasome  100
Constructed proteasome chain 65  Rpn8           Proteasome  82.9
Constructed proteasome chain 66  Rpn11          Proteasome  89.5
Constructed proteasome chain 67  Rpn10          Proteasome  100
Constructed proteasome chain 68  Rpn13          Proteasome  100
Constructed proteasome chain 69  Sem1           Proteasome  100
Constructed proteasome chain 70  Rpn1           Proteasome  85.9
Constructed proteasome chain 9   Scl1           Proteasome  100

Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I, II and III and proteasome structures

Yeast protein  Complex     Reference       Number of missing residues in C-terminus
Rpa190         RNApol I    4C2M monomer 1  0
Rpa14          RNApol I    4C2M monomer 1  37
Rpa12          RNApol I    4C2M monomer 1  0
Rpb5           RNApol I    4C2M monomer 1  0
Rpb10          RNApol I    4C2M monomer 1  1
Rpa49          RNApol I    4C2M monomer 1  300
Rpc19          RNApol I    4C2M monomer 1  0
Rpb8           RNApol I    4C2M monomer 1  0
Rpa34          RNApol I    4C2M monomer 1  52
Rpa43          RNApol I    4C2M monomer 1  10
Rpc40          RNApol I    4C2M monomer 1  0
Rpc10          RNApol I    4C2M monomer 1  0
Rpa135         RNApol I    4C2M monomer 1  0
Rpo26          RNApol I    4C2M monomer 1  1
Rpa190         RNApol I    4C2M monomer 2  0
Rpa14          RNApol I    4C2M monomer 2  37
Rpa12          RNApol I    4C2M monomer 2  0
Rpb5           RNApol I    4C2M monomer 2  0
Rpb10          RNApol I    4C2M monomer 2  1
Rpa49          RNApol I    4C2M monomer 2  300
Rpc19          RNApol I    4C2M monomer 2  0
Rpb8           RNApol I    4C2M monomer 2  0
Rpa34          RNApol I    4C2M monomer 2  53
Rpa43          RNApol I    4C2M monomer 2  76
Rpc40          RNApol I    4C2M monomer 2  0
Rpc10          RNApol I    4C2M monomer 2  0
Rpa135         RNApol I    4C2M monomer 2  0
Rpo26          RNApol I    4C2M monomer 2  1
Rpa190         RNApol I    4C3I            1
Rpa14          RNApol I    4C3I            37
Rpb5           RNApol I    4C3I            0
Rpb10          RNApol I    4C3I            1
Rpa49          RNApol I    4C3I            301
Rpc19          RNApol I    4C3I            0
Rpb8           RNApol I    4C3I            0
Rpa34          RNApol I    4C3I            53
Rpa12          RNApol I    4C3I            0
Rpa43          RNApol I    4C3I            10
Rpc40          RNApol I    4C3I            0
Rpc10          RNApol I    4C3I            0
Rpa135         RNApol I    4C3I            0
Rpo26          RNApol I    4C3I            1
Rpb3           RNApol II   4V1N            50
Rpb11          RNApol II   4V1N            6
Rpb5           RNApol II   4V1N            0
Rpb7           RNApol II   4V1N            0
Rpb10          RNApol II   4V1N            5
Rpo26          RNApol II   4V1N            0
Rpb8           RNApol II   4V1N            0
Rpb4           RNApol II   4V1N            0
Rpb9           RNApol II   4V1N            2
Tfg2           RNApol II   4V1N            173
Rpb2           RNApol II   4V1N            0
Rpc10          RNApol II   4V1N            0
Rpo21          RNApol II   4V1N            278
Rpc11          RNApol III  5FJA            0
Rpc19          RNApol III  5FJA            0
Ret1           RNApol III  5FJA            0
Rpb5           RNApol III  5FJA            0
Rpb10          RNApol III  5FJA            3
Rpc37          RNApol III  5FJA            20
Rpc82          RNApol III  5FJA            0
Rpc31          RNApol III  5FJA            182
Rpb8           RNApol III  5FJA            0
Rpc53          RNApol III  5FJA            0
Rpc25          RNApol III  5FJA            0
Rpc34          RNApol III  5FJA            2
Rpo31          RNApol III  5FJA            0
Rpc40          RNApol III  5FJA            0
Rpc10          RNApol III  5FJA            0
Rpc17          RNApol III  5FJA            0
Rpo26          RNApol III  5FJA            2
Rpn6           Proteasome  5CZ4 and 5A5B   3
Rpn5           Proteasome  5CZ4 and 5A5B   3
Rpn3           Proteasome  5CZ4 and 5A5B   45
Rpn2           Proteasome  5CZ4 and 5A5B   20
Rpn1           Proteasome  5CZ4 and 5A5B   0
Rpn9           Proteasome  5CZ4 and 5A5B   6
Rpn8           Proteasome  5CZ4 and 5A5B   30
Pre10          Proteasome  5CZ4 and 5A5B   39
Pre6           Proteasome  5CZ4 and 5A5B   10
Pre7           Proteasome  5CZ4 and 5A5B   0
Rpt3           Proteasome  5CZ4 and 5A5B   0
Rpt2           Proteasome  5CZ4 and 5A5B   1
Pre2           Proteasome  5CZ4 and 5A5B   0
Rpt4           Proteasome  5CZ4 and 5A5B   10
Pre1           Proteasome  5CZ4 and 5A5B   3
Pre8           Proteasome  5CZ4 and 5A5B   0
Pre9           Proteasome  5CZ4 and 5A5B   12
Pup2           Proteasome  5CZ4 and 5A5B   9
Pup3           Proteasome  5CZ4 and 5A5B   0
Pup1           Proteasome  5CZ4 and 5A5B   6
Rpn13          Proteasome  5CZ4 and 5A5B   23
Rpn12          Proteasome  5CZ4 and 5A5B   2
Rpn11          Proteasome  5CZ4 and 5A5B   8
Rpn10          Proteasome  5CZ4 and 5A5B   71
Sem1           Proteasome  5CZ4 and 5A5B   0
Scl1           Proteasome  5CZ4 and 5A5B   0
Rpt1           Proteasome  5CZ4 and 5A5B   11
Pre4           Proteasome  5CZ4 and 5A5B   4
Pre5           Proteasome  5CZ4 and 5A5B   0
Rpt5           Proteasome  5CZ4 and 5A5B   0
Pre3           Proteasome  5CZ4 and 5A5B   0
Rpt6           Proteasome  5CZ4 and 5A5B   9
Rpn7           Proteasome  5CZ4 and 5A5B   7

Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. The Act1 protein was used as a loading control. (B) Distribution of PPI signals (colony size) obtained in the global PCA experiment (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right owing to the ability of the 4xL linker to detect interactions between proteins that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results of the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
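Panel (B) relates colony size to a z-score threshold. The sketch below illustrates the idea under a stated assumption: the z-score is taken here as the deviation of a colony size from the mean of a background distribution, in units of the background standard deviation. The exact normalization used in the study is not given in this section. The colony sizes are invented, the threshold is a parameter whose default of 2.5 is itself an assumed value, and the function names are hypothetical.

```python
from statistics import mean, stdev


def z_scores(colony_sizes, background_sizes):
    """Z-score of each colony size relative to the background distribution."""
    mu = mean(background_sizes)
    sigma = stdev(background_sizes)
    return [(size - mu) / sigma for size in colony_sizes]


def positive_ppis(colony_sizes, background_sizes, threshold=2.5):
    """Flag colonies whose z-score exceeds the threshold (assumed default)."""
    return [z > threshold for z in z_scores(colony_sizes, background_sizes)]


# Hypothetical colony sizes in arbitrary units.
background = [8.0, 10.0, 12.0]
tested = [11.0, 16.0, 30.0]

scores = z_scores(tested, background)
calls = positive_ppis(tested, background)
print(scores, calls)
```

Keeping the threshold as an argument makes it easy to check how sensitive the set of positive PPIs is to the chosen cutoff.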


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the cores of the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
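A distance-weighted shortest path over surface residues, as illustrated in panels (C) and (D), can be sketched with Dijkstra's algorithm. This is only an illustration under assumptions that are not taken from the study: residues are reduced to single 3D points, two points are connected when they lie within a hypothetical `cutoff` distance, and each edge is weighted by the Euclidean distance. The coordinates and the name `shortest_surface_path` are invented.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, cutoff):
    """Length of the shortest path from start to end through points that are
    within `cutoff` of each other, with edges weighted by Euclidean distance."""
    best = {start: 0.0}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        dist, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == end:
            return dist
        for other, xyz in coords.items():
            if other in visited:
                continue
            step = math.dist(coords[node], xyz)
            if step <= cutoff and dist + step < best.get(other, math.inf):
                best[other] = dist + step
                heapq.heappush(heap, (dist + step, other))
    return math.inf  # no path under this cutoff


# Hypothetical coordinates (in angstroms) for two C-termini and two surface points.
points = {
    "cterm_a": (0.0, 0.0, 0.0),
    "surface_1": (3.0, 4.0, 0.0),
    "surface_2": (6.0, 0.0, 0.0),
    "cterm_b": (9.0, 4.0, 0.0),
}

path_length = shortest_surface_path(points, "cterm_a", "cterm_b", cutoff=5.5)
print(path_length)
```

Because the path is forced through neighbouring surface points, its length is longer than the straight-line distance between the two C-termini, which is the reason for using a path along the surface rather than a direct distance.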


General conclusion

The aim of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close to each other in space, whether or not they interact physically. Such a method would therefore make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, the approach consisted of modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results obtained with distances measured from the available protein structures of the proteasome.

The results of the first validation show that adjusting a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thus allowed associations to be identified more reliably. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA detects new interactions while retaining the specificity of the method. Indeed, more than 30% of all unique interactions detected were new. One could therefore expect to obtain a similar proportion of new interactions if this variation of PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases. The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. Even so, these cases can still provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers is informative about the organization of protein complexes, particularly when central proteins are involved. Finally, the associations detected closely reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to DHFR PCA represents a valuable advance in the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside linkers of additional lengths, which would provide validation and a point of comparison and would help detect problems that may have arisen during the construction of the proteins. For example, incorrect recombination or the appearance of mutations would be easier to spot: interactions would be observed for the correctly constructed protein, whereas the problematic construct would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method in which proteins are expressed from their endogenous promoter. The fragments do not tend to associate spontaneously unless they are brought very close together, which reduces false positives. DHFR PCA can be performed on either solid or liquid medium, so PPIs are easy to study under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the dynamics of interactions (56). These features provide certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would give several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also need to be determined, so as to maximize the new information gained while taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would give a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be required for large protein complexes.

The method developed during this master's project is particularly attractive for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which can be seen by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they have been linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by extending the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1 Vidal M Cusick ME Barabasi AL Interactome networks and human disease Cell 2011144(6)986-98 2 Taylor SS Ilouz R Zhang P Kornev AP Assembly of allosteric macromolecular switches lessons from PKA Nature reviews Molecular cell biology 201213(10)646-58 3 Vandamme J Castermans D Thevelein JM Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation Cellular signalling 201224(8)1610-8 4 Conrad M Schothorst J Kankipati HN Van Zeebroeck G Rubio-Texeira M Thevelein JM Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae FEMS microbiology reviews 201438(2)254-99 5 Broach JR RAS genes in Saccharomyces cerevisiae signal transduction in search of a pathway Trends in genetics TIG 19917(1)28-33 6 Fontana L Partridge L Longo VD Extending healthy life span--from yeast to humans Science 2010328(5976)321-6 7 Wong W Scott JD AKAP signalling complexes focal points in space and time Nature reviews Molecular cell biology 20045(12)959-70 8 Beuschlein F Fassnacht M Assie G Calebiro D Stratakis CA Osswald A et al Constitutive activation of PKA catalytic subunit in adrenal Cushings syndrome N Engl J Med 2014370(11)1019-28 9 Bult CJ Drabkin HJ Evsikov A Natale D Arighi C Roberts N et al The representation of protein complexes in the Protein Ontology (PRO) BMC Bioinformatics 201112371 10 Peters JM Cejka Z Harris JR Kleinschmidt JA Baumeister W Structural features of the 26 S proteasome complex J Mol Biol 1993234(4)932-7 11 Voges D Zwickl P Baumeister W The 26S proteasome a molecular machine designed for controlled proteolysis Annual review of biochemistry 1999681015-68 12 Tanaka K The proteasome overview of structure and functions Proceedings of the Japan Academy Series B Physical and biological sciences 200985(1)12-36 13 Wehmer M Sakata E Recent advances in the structural biology of the 26S proteasome Int J Biochem Cell Biol 201679437-42 14 Gomes AV Genetics of proteasome diseases Scientifica 20132013637629 15 Miller Z Ao L Kim KB Lee 
W Inhibitors of the immunoproteasome current status and future directions Current pharmaceutical design 201319(22)4140-51 16 Kaur G Batra S Emerging role of immunoproteasomes in pathophysiology Immunology and cell biology 201694(9)812-20 17 Rual J-F Venkatesan K Hao T Hirozane-Kishikawa T Dricot A Li N et al Towards a proteome-scale map of the human protein-protein interaction network Nature 2005437(7062)1173-8 18 Krogan NJ Cagney G Yu H Zhong G Guo X Ignatchenko A et al Global landscape of protein complexes in the yeast Saccharomyces cerevisiae Nature 2006440(7084)637-43 19 Collins SR Kemmeren P Zhao XC Greenblatt JF Spencer F Holstege FC et al Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae Molecular amp cellular proteomics MCP 20076(3)439-50 20 Gavin AC Aloy P Grandi P Krause R Boesche M Marzioch M et al Proteome survey reveals modularity of the yeast cell machinery Nature 2006440(7084)631-6 21 Giot L Bader JS Brouwer C Chaudhuri A Kuang B Li Y et al A protein interaction map of Drosophila melanogaster Science 2003302(5651)1727-36


22 Li S Armstrong CM Bertin N Ge H Milstein S Boxem M et al A map of the interactome network of the metazoan C elegans Science 2004303(5657)540-3 23 Rajagopala SV Sikorski P Kumar A Mosca R Vlasblom J Arnold R et al The binary protein-protein interaction landscape of Escherichia coli Nat Biotech 201432(3)285-90 24 Parrish JR Yu J Liu G Hines JA Chan JE Mangiola BA et al A proteome-wide protein interaction map for Campylobacter jejuni Genome Biology 20078(7)1-19 25 Wang Y Cui T Zhang C Yang M Huang Y Li W et al Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv Journal of proteome research 20109(12)6665-77 26 Cherkasov A Hsing M Zoraghi R Foster LJ See RH Stoynov N et al Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus Journal of proteome research 201110(3)1139-50 27 Hagen N Bayer K Rosch K Schindler M The intraviral protein interaction network of hepatitis C virus Molecular amp cellular proteomics MCP 201413(7)1676-89 28 Fossum E Friedel CC Rajagopala SV Titz B Baiker A Schmidt T et al Evolutionarily conserved herpesviral protein interaction networks PLoS pathogens 20095(9)e1000570 29 Stellberger T Hauser R Baiker A Pothineni VR Haas J Uetz P Improving the yeast two-hybrid system with permutated fusions proteins the Varicella Zoster Virus interactome Proteome science 201088 30 Obado SO Brillantes M Uryu K Zhang W Ketaren NE Chait BT et al Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex PLoS biology 201614(2)e1002365 31 Diss G Dube AK Boutin J Gagnon-Arsenault I Landry CR A systematic approach for the genetic dissection of protein complexes in living cells Cell Rep 20133(6)2155-67 32 Ferreira LG Oliva G Andricopulo AD Protein-protein interaction inhibitors advances in anticancer drug design Expert opinion on drug discovery 2016 33 Hamdi A Colas P Yeast two-hybrid methods and their applications in drug discovery Trends in pharmacological sciences 
201233(2)109-18 34 Zoraghi R Reiner NE Protein interaction networks as starting points to identify novel antimicrobial drug targets Current opinion in microbiology 201316(5)566-72 35 Khare S Nagle AS Biggart A Lai YH Liang F Davis LC et al Proteasome inhibition for treatment of leishmaniasis Chagas disease and sleeping sickness Nature 2016 36 Sahni N Yi S Taipale M Fuxman Bass JI Coulombe-Huntington J Yang F et al Widespread macromolecular interaction perturbations in human genetic disorders Cell 2015161(3)647-60 37 Jensen LJ Bork P Biochemistry Not comparable but complementary Science 2008322(5898)56-7 38 Syafrizayanti Betzen C Hoheisel JD Kastelic D Methods for analyzing and quantifying protein-protein interaction Expert review of proteomics 201411(1)107-20 39 Marcilla M Albar JP Quantitative proteomics A strategic ally to map protein interaction networks IUBMB life 201365(1)9-16 40 Woods AG Sokolowska I Ngounou Wetie AG Wormwood K Aslebagh R Patel S et al Mass spectrometry for proteomics-based investigation Advances in experimental medicine and biology 20148061-32 41 Chen GI Gingras AC Affinity-purification mass spectrometry (AP-MS) of serinethreonine phosphatases Methods 200742(3)298-305 42 Dunham WH Mullin M Gingras AC Affinity-purification coupled to mass spectrometry basic principles and strategies Proteomics 201212(10)1576-90


43 Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69 44 Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6 45 Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32 46 Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54 47 Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36) 48 Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82 49 Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88 50 Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848 51 Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6 52 Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4 53 Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63 54 Michnick SW Exploring protein interactions by interaction-induced folding of proteins from 
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14


64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9


84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709


membranes were then probed with IRDye® 680RD Goat anti-Rabbit IgG (1:10,000), IRDye® 680RD Donkey anti-Goat IgG (1:5,000) or IRDye® 800CW Goat anti-Mouse IgG (1:10,000) in Blocking Buffer + 0.02% SDS + 0.2% Tween 20. Three washes of 10 min in PBS + 0.2% Tween 20 were performed, and the signal on membranes was detected using an Odyssey® Fc Imaging System (LI-COR®). Quantifications were done with Image Studio™ Lite software.

Protein-fragment complementation assays

For the global PCA experiment, baits consisted of 15 proteins, fused to 2x/3x/4xL-DHFR F[1,2], that are part of seven complexes. Prey proteins fused to 2xL-DHFR F[3] (495 strains) were selected because they belonged to the same complexes as the baits or because they interacted with one of them according to data reported in BioGRID in October 2014 (96). A random set of 97 strains corresponding to proteins found in the cytoplasm or the nucleus was also included in the set of preys as controls. Each prey was present in four replicates, two on each prey plate, so each interaction was measured four times. Preys were randomly positioned to avoid location biases.

For the intra-complexes experiment, we reviewed the literature and considered the consensus protein complexes published by Benschop et al. (84) to choose 95 central and associated protein members of the following complexes: RNApol I, II and III, the proteasome and the COG complex. These complexes were selected because they vary in size (RNApol I (n=14), II (n=12), III (n=17) and associated proteins (n=9, 7 tested); proteasome (n=47, 44 tested); and COG complex (n=8)) and because interactions among their members have been shown to be detectable, at least partially, by DHFR PCA. In addition, published structures are available for the RNApol and proteasome complexes, making it possible to compare our results with known protein complex organization. We successfully constructed 80.0% and 76.6% of the strains in MATa and 65.0% and 70.2% in MATα for the RNApol and proteasome, respectively, and 100% for the COG complex. In total, 286 strains harboring proteins fused to 2xL/4xL-F[1,2] and/or 2xL/4xL-F[3] were used, a representation of 89.5% of the proteins (85 out of the 95 proteins selected at first are tagged with 2xL and 4xL in at least one mating type). MATα 2xL/4xL-DHFR F[3] cells were used as baits. Two different prey plates of MATa cells were generated, including all strains mentioned above. Baits and preys were positioned such that, in a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed to contain a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid any border effects on the growth of the colonies.

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB, with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we used standard DHFR PCA to confirm 25 PPIs for which the difference in signal was increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments is shown in Fig S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed, and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files, and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, using the particle closest to the center. We considered a particle to be a colony if its center of mass was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.
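The transformation and filtering step described above can be sketched as follows. This is a minimal illustration rather than the authors' script: the function name, the dictionary-based data layout and the colony identifiers are hypothetical, and the size cutoff is exposed as a parameter whose default is the value printed in the text.

```python
import math

def log2_scores(mtx_intensity, diploid_size, min_diploid_size=16):
    """Return log2(intensity + 1) for each colony that passes the size filter.

    mtx_intensity: dict mapping a colony id to its intensity on the MTX plate.
    diploid_size:  dict mapping a colony id to its size on the diploid plate.
    Both layouts are assumptions made for this example.
    """
    scores = {}
    for colony, intensity in mtx_intensity.items():
        # Eliminate colonies that were too small on the diploid selection plate.
        if diploid_size.get(colony, 0) < min_diploid_size:
            continue
        # Adding 1 keeps a zero intensity from producing log2(0).
        scores[colony] = math.log2(intensity + 1)
    return scores
```

For instance, a colony with an MTX intensity of 7 and a diploid size of 20 receives a score of log2(8) = 3, whereas a colony with a diploid size of 10 is discarded.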

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained, and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered significant, detected interactions. These Zs were used to compare the same interaction across linker size combinations. We considered changes significant when Zs differed by more than 2.
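To make the scoring step concrete, the sketch below fits a two-component normal mixture by expectation-maximization and converts an interaction score into a z-score against the lower (background) component. It is an illustrative stand-in written with the Python standard library, not the mixtools routine used in the study; the function names, the initialization and the fixed iteration count are assumptions of this example.

```python
import math
import statistics

def fit_two_normals(scores, n_iter=200):
    """EM fit of a two-component 1D Gaussian mixture.

    Returns two (mean, sd, weight) tuples sorted by mean, so the first
    tuple is treated as the background component.
    """
    xs = sorted(scores)
    half = len(xs) // 2
    # Crude initialization (an assumption): lower and upper halves of the data.
    mu = [statistics.fmean(xs[:half]), statistics.fmean(xs[half:])]
    sd = [statistics.pstdev(xs) or 1.0] * 2
    w = [0.5, 0.5]
    for _ in range(n_iter):
        d0 = statistics.NormalDist(mu[0], sd[0])
        d1 = statistics.NormalDist(mu[1], sd[1])
        # E-step: probability that each score belongs to component 0.
        resp = []
        for x in xs:
            p0, p1 = w[0] * d0.pdf(x), w[1] * d1.pdf(x)
            resp.append(p0 / (p0 + p1) if p0 + p1 > 0 else 0.5)
        # M-step: update the mean, sd and weight of each component.
        for k, r in enumerate((resp, [1 - v for v in resp])):
            nk = sum(r)
            if nk < 1e-9:
                continue
            mu[k] = sum(ri * x for ri, x in zip(r, xs)) / nk
            var = sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, xs)) / nk
            sd[k] = max(math.sqrt(var), 1e-6)
            w[k] = nk / len(xs)
    return sorted(zip(mu, sd, w))

def z_score(interaction_score, b, sdb):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background component."""
    return (interaction_score - b) / sdb
```

With the fitted background parameters, an interaction would be called detected when `z_score(Is, b, sdb)` exceeds 2.5, following the threshold given in the text.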

For the intra-complexes experiment, extreme outliers on the MTX selection plates, i.e. values farther from the median than Q1 − 3(Q3 − Q1) or Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained, and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
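The extreme-outlier rule at the start of this paragraph amounts to keeping only values inside the fences Q1 − 3(Q3 − Q1) and Q3 + 3(Q3 − Q1). A minimal sketch follows; the function name is hypothetical, and the quartile convention (the "inclusive" method of Python's statistics module) is an assumption, since the text does not state which one was used.

```python
import statistics

def drop_extreme_outliers(values, k=3.0):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR], where IQR = Q3 - Q1."""
    # The quartile convention is an assumption; the source does not specify it.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]
```

For the replicate values 10, 11, 12, 13, 14 and 100, the quartiles are 11.25 and 13.75, so the fences sit at 3.75 and 21.25 and only the value 100 is discarded.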

At this step, pairs of interacting proteins presenting a new interaction (i.e. the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned such that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference from the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05); and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than −1, with t-test FDR p-value < 0.05) (Table S1C).
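The classification logic above can be summarized in a short sketch. Two caveats apply: the text does not say which FDR procedure was used, so the Benjamini-Hochberg adjustment shown here is an assumption, and the paired t-test itself is not reimplemented (its p-values are taken as input). Function names and category labels are hypothetical, and pairs detected only with the reference linker are not given a separate category in this simplified version.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify_pair(detected_ref, detected_longer, diffs, fdr_pvalues,
                  min_diff=1.0, alpha=0.05):
    """Assign a protein pair to one of the categories described in the text.

    detected_ref:    True if detected with the reference 2xL-2xL combination.
    detected_longer: True if detected with at least one longer combination.
    diffs:           Is differences versus 2xL-2xL, one per longer combination.
    fdr_pvalues:     FDR-corrected paired t-test p-values, in the same order.
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    changed = any(abs(d) > min_diff and p < alpha
                  for d, p in zip(diffs, fdr_pvalues))
    return "quantitative change" if changed else "unchanged"
```

Under these assumptions, a pair detected with 2xL-2xL whose 4xL-4xL score differs by 1.4 with an adjusted p-value of 0.01 would fall in the quantitative-change category.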

Analysis of protein distances within complexes

Yeast protein sequences of RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig S2C). The Cα atoms of surface residues were used as nodes to build the graph. The edges of the graph were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure were considered surface residues (see Fig S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
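The surface-path distance described above can be illustrated with a small graph example. The study used the Dijkstra implementation in NetworkX; the sketch below reimplements the same idea with the Python standard library so that it is self-contained. The function names, the node labels and the toy coordinates are hypothetical, and the cutoff is passed as a parameter rather than fixed to the values quoted in the text.

```python
import heapq
import math

def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`; weight = distance.

    coords: dict mapping a node label to its (x, y, z) coordinates, standing
    in for the C-alpha positions of surface residues.
    """
    graph = {node: [] for node in coords}
    nodes = list(coords)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph

def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, math.inf):
            continue  # stale heap entry
        for neighbor, weight in graph[node]:
            nd = d + weight
            if nd < best.get(neighbor, math.inf):
                best[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return math.inf
```

Because edges exist only between nearby surface nodes, the resulting path length follows the surface of the complex instead of cutting straight through it, which is presumably why a graph distance was preferred to a direct Euclidean distance.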

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig S1A), suggesting that if linker length affects protein stability, the effect is minor and not generalized.

To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions (45 baits × 592 preys) in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered true PPIs (Fig S1B and Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI).

As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction
strength was observed when using the 4xL compared to the 2xL, reinforcing the observation
that no overall decrease in specificity is seen with the elongated linkers. However, the
increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.
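The distinction between interacting pairs and unique pairs can be made concrete with a short sketch: a reciprocal pair (X as bait with Y as prey, and Y as bait with X as prey) collapses to one unordered pair. The helper name `unique_pairs` and the placeholder protein names are hypothetical; only the counts in the last lines come from the text.

```python
def unique_pairs(directed_pairs):
    """Collapse (bait, prey) and (prey, bait) into a single unordered pair."""
    return {frozenset(pair) for pair in directed_pairs}


# Placeholder names: X-Y and Y-X are reciprocal and count once.
directed = [("X", "Y"), ("Y", "X"), ("X", "Z")]
n_unique = len(unique_pairs(directed))

# Fractions of unchanged pairs quoted in the text ("almost 60%").
unchanged_all = round(100 * 135 / 228, 1)
unchanged_unique = round(100 * 114 / 197, 1)
```

The same arithmetic confirms that the categories add up: 135 unchanged plus 93 affected pairs give 228, and 114 plus 83 give 197 unique pairs.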

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the
detection of new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of the RNApol II, contributes to five out of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than through the destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions were detected for RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual
polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1, from the core
subunit, and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer
linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left
and right panels). Structural biology in living cells could thus gain from PPI data obtained
with several linker lengths.
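The fraction of detected PPIs per combination of subcomplexes (for example base-base or base-core) can be computed as in the sketch below. This is an illustration under stated assumptions rather than the analysis pipeline used for Fig. 1E: the function name `detected_fraction`, the protein labels and the toy data are all hypothetical.

```python
from collections import defaultdict


def detected_fraction(tested, detected, subcomplex):
    """Fraction of tested pairs that were detected, per subcomplex combination.

    tested: iterable of (protein_a, protein_b) pairs that were assayed
    detected: set of pairs called as PPIs
    subcomplex: dict mapping each protein to its subcomplex label
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [detected, tested]
    for pair in tested:
        a, b = pair
        key = tuple(sorted((subcomplex[a], subcomplex[b])))
        counts[key][1] += 1
        if pair in detected:
            counts[key][0] += 1
    return {key: hit / total for key, (hit, total) in counts.items()}


# Toy data for illustration only.
labels = {"a1": "base", "a2": "base", "b1": "core", "b2": "core"}
tested_pairs = [("a1", "a2"), ("b1", "b2"), ("a1", "b1"), ("a2", "b2")]
detected_pairs = {("a1", "a2"), ("a1", "b1")}
fractions = detected_fraction(tested_pairs, detected_pairs, labels)
```

Computing these fractions once per linker combination would allow the within-subcomplex and between-subcomplex detection rates to be compared across linker lengths.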

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance,
we measured the shortest path between C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and interaction z-score of PPIs for all lengths of
linkers (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a
better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances
with longer linker sizes is clearly visible from the cumulative distribution of z-scores as a
function of pairwise distances, where positive z-scores accumulate to a longer distance for
the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density
distribution of distances within complexes is also slightly shifted towards larger distances
for longer linkers, showing that longer distances are more readily detected with longer linker
sizes (Fig. S1D). Finally, we find that the distance between proteins is significantly longer
for cases where longer linker size increases signal or leads to the detection of new
interactions (Fig. 2C). This demonstrates once again that longer linker size enhances the
ability to detect interactions, especially for proteins that are more distant in space.
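The negative relationship between pairwise distance and z-score is reported as a Spearman rank correlation. As a minimal, standard-library sketch of that statistic (not the authors' code; the helper names are hypothetical and the example values are invented), the coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank:

```python
def ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented example: z-scores that strictly decrease with distance.
distances = [20.0, 45.0, 80.0, 150.0]
scores = [9.1, 6.4, 3.0, 0.5]
rho = spearman(distances, scores)
```

A perfectly monotonic decrease gives a coefficient of -1; the values reported for the proteasome are weaker negative correlations, which fits the statement that z-scores reflect distance only in part.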

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly
and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and
measure protein proximity in living cells and among endogenously expressed proteins. Here,
we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used
to detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys, while requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.


Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful to infer the super-organization
of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the
intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering each pair of reciprocal interactions, such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle
plots of all detected PPIs for selected complexes. Line thickness is proportional to the
difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged
PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and
proteasome complexes according to the distance between the C-termini, for interactions that
are not affected by longer linkers and those that increase in signal or that are newly
detected. p-values of Wilcoxon tests are shown.


Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.


Table S2B. Identity between each RNApol structure and the experimental sequences

Reference    Yeast protein    Complex    Identity (%)
4C2M chain 1    Rpc10    RNApol I    100
4C2M chain 2    Rpa34    RNApol I    92.4
4C2M chain 3    Rpa49    RNApol I    94.4
4C2M chain 4    Rpa43    RNApol I    100
4C2M chain 5    Rpa190    RNApol I    89.7
4C2M chain 6    Rpc40    RNApol I    100
4C2M chain 7    Rpa135    RNApol I    97.2
4C2M chain 8    Rpb5    RNApol I    100
4C2M chain 9    Rpa14    RNApol I    59.6
4C2M chain 10    Rpa43    RNApol I    81.4
4C2M chain 11    Rpo26    RNApol I    100
4C2M chain 12    Rpa12    RNApol I    100
4C2M chain 13    Rpb8    RNApol I    88.2
4C2M chain 14    Rpc19    RNApol I    100
4C2M chain 15    Rpb10    RNApol I    100
4C2M chain 16    Rpa49    RNApol I    100
4C2M chain 17    Rpc10    RNApol I    100
4C2M chain 18    Rpa43    RNApol I    100
4C2M chain 19    Rpa34    RNApol I    92.4
4C2M chain 20    Rpa135    RNApol I    96.2
4C2M chain 21    Rpa190    RNApol I    88.5
4C2M chain 22    Rpa14    RNApol I    55.1
4C2M chain 23    Rpc40    RNApol I    100
4C2M chain 24    Rpo26    RNApol I    100
4C2M chain 25    Rpb5    RNApol I    100
4C2M chain 26    Rpb8    RNApol I    88.2
4C2M chain 27    Rpa43    RNApol I    80.2
4C2M chain 28    Rpb10    RNApol I    100
4C2M chain 29    Rpa12    RNApol I    96
4C2M chain 30    Rpc19    RNApol I    100
4C3I chain A    Rpa190    RNApol I    89.2
4C3I chain C    Rpc40    RNApol I    99.3
4C3I chain B    Rpa135    RNApol I    98.2
4C3I chain E    Rpb5    RNApol I    100
4C3I chain D    Rpa14    RNApol I    55.1
4C3I chain G    Rpa43    RNApol I    78.3
4C3I chain F    Rpo26    RNApol I    100
4C3I chain I    Rpa12    RNApol I    100
4C3I chain H    Rpb8    RNApol I    84.7
4C3I chain K    Rpc19    RNApol I    100
4C3I chain J    Rpb10    RNApol I    100
4C3I chain M    Rpa49    RNApol I    97.2
4C3I chain L    Rpc10    RNApol I    100
4C3I chain N    Rpa34    RNApol I    88
4V1N chain A    Rpo21    RNApol II    97.9
4V1N chain C    Rpb3    RNApol II    100
4V1N chain B    Rpb2    RNApol II    93.6
4V1N chain E    Rpb5    RNApol II    100
4V1N chain D    Rpb4    RNApol II    80.8
4V1N chain G    Rpb7    RNApol II    100
4V1N chain F    Rpo26    RNApol II    100
4V1N chain I    Rpb9    RNApol II    100
4V1N chain H    Rpb8    RNApol II    91
4V1N chain K    Rpb11    RNApol II    100
4V1N chain J    Rpb10    RNApol II    100
4V1N chain L    Rpc10    RNApol II    100
4V1N chain R    Tfg2    RNApol II    60.3
5FJA chain A    Rpo31    RNApol III    96.2
5FJA chain C    Rpc40    RNApol III    100
5FJA chain B    Ret1    RNApol III    100
5FJA chain E    Rpb5    RNApol III    100
5FJA chain D    Rpc17    RNApol III    73.9
5FJA chain G    Rpc25    RNApol III    85.8
5FJA chain F    Rpo26    RNApol III    100
5FJA chain I    Rpc11    RNApol III    82.7
5FJA chain H    Rpb8    RNApol III    94.5
5FJA chain K    Rpc19    RNApol III    100
5FJA chain J    Rpb10    RNApol III    100
5FJA chain M    Rpc37    RNApol III    84.9
5FJA chain L    Rpc10    RNApol III    100
5FJA chain O    Rpc82    RNApol III    84.3
5FJA chain N    Rpc53    RNApol III    73.8
5FJA chain Q    Rpc31    RNApol III    100
5FJA chain P    Rpc34    RNApol III    57.2


Table S2C. Identity between the proteasome structures and the experimental sequences

Reference    Yeast protein    Complex    Identity (%)
5CZ4-centered chain A    Pre8    Proteasome    100
5CZ4-centered chain AA    Pre4    Proteasome    100
5CZ4-centered chain B    Pre9    Proteasome    100
5CZ4-centered chain BA    Pre3    Proteasome    100
5CZ4-centered chain C    Pre6    Proteasome    100
5CZ4-centered chain D    Pup2    Proteasome    97.1
5CZ4-centered chain E    Pre5    Proteasome    100
5CZ4-centered chain F    Pre10    Proteasome    100
5CZ4-centered chain G    Scl1    Proteasome    100
5CZ4-centered chain H    Pup1    Proteasome    100
5CZ4-centered chain I    Pup3    Proteasome    100
5CZ4-centered chain J    Pre1    Proteasome    100
5CZ4-centered chain K    Pre2    Proteasome    100
5CZ4-centered chain L    Pre7    Proteasome    100
5CZ4-centered chain M    Pre4    Proteasome    100
5CZ4-centered chain N    Pre3    Proteasome    100
5CZ4-centered chain O    Pre8    Proteasome    100
5CZ4-centered chain P    Pre9    Proteasome    100
5CZ4-centered chain Q    Pre6    Proteasome    100
5CZ4-centered chain R    Pup2    Proteasome    97.1
5CZ4-centered chain S    Pre5    Proteasome    100
5CZ4-centered chain T    Pre10    Proteasome    100
5CZ4-centered chain U    Scl1    Proteasome    100
5CZ4-centered chain V    Pup1    Proteasome    100
5CZ4-centered chain W    Pup3    Proteasome    100
5CZ4-centered chain X    Pre1    Proteasome    100
5CZ4-centered chain Y    Pre2    Proteasome    100
5CZ4-centered chain Z    Pre7    Proteasome    100
5A5B-centered chain A    Pre3    Proteasome    100
5A5B-centered chain AA    Rpn7    Proteasome    100
5A5B-centered chain B    Pup1    Proteasome    100
5A5B-centered chain BA    Rpn3    Proteasome    100
5A5B-centered chain C    Pup3    Proteasome    100
5A5B-centered chain CA    Rpn12    Proteasome    100
5A5B-centered chain D    Pre1    Proteasome    100
5A5B-centered chain DA    Rpn8    Proteasome    82.9
5A5B-centered chain E    Pre2    Proteasome    99.5
5A5B-centered chain EA    Rpn11    Proteasome    89.5
5A5B-centered chain F    Pre7    Proteasome    100
5A5B-centered chain FA    Rpn10    Proteasome    100
5A5B-centered chain G    Pre4    Proteasome    100
5A5B-centered chain GA    Rpn13    Proteasome    100
5A5B-centered chain HA    Sem1    Proteasome    100
5A5B-centered chain IA    Rpn1    Proteasome    85.9
5A5B-centered chain J    Scl1    Proteasome    100
5A5B-centered chain K    Pre8    Proteasome    100
5A5B-centered chain L    Pre9    Proteasome    100
5A5B-centered chain M    Pre6    Proteasome    100
5A5B-centered chain N    Pup2    Proteasome    100
5A5B-centered chain O    Pre5    Proteasome    100
5A5B-centered chain P    Pre10    Proteasome    100
5A5B-centered chain Q    Rpt1    Proteasome    88
5A5B-centered chain R    Rpt2    Proteasome    100
5A5B-centered chain S    Rpt6    Proteasome    100
5A5B-centered chain T    Rpt3    Proteasome    100
5A5B-centered chain U    Rpt4    Proteasome    100
5A5B-centered chain V    Rpt5    Proteasome    93.1
5A5B-centered chain W    Rpn2    Proteasome    90.9
5A5B-centered chain X    Rpn9    Proteasome    100
5A5B-centered chain Y    Rpn5    Proteasome    100
5A5B-centered chain Z    Rpn6    Proteasome    100
Constructed proteasome chain 1    Pup1    Proteasome    100
Constructed proteasome chain 10    Pre8    Proteasome    100
Constructed proteasome chain 11    Pre9    Proteasome    100
Constructed proteasome chain 12    Pre6    Proteasome    100
Constructed proteasome chain 13    Pup2    Proteasome    100
Constructed proteasome chain 14    Pre5    Proteasome    100
Constructed proteasome chain 15    Pre10    Proteasome    100
Constructed proteasome chain 16    Rpt1    Proteasome    88
Constructed proteasome chain 17    Rpt2    Proteasome    100
Constructed proteasome chain 18    Rpt6    Proteasome    100
Constructed proteasome chain 19    Rpt3    Proteasome    100
Constructed proteasome chain 2    Pup3    Proteasome    100
Constructed proteasome chain 20    Rpt4    Proteasome    100
Constructed proteasome chain 21    Rpt5    Proteasome    93.1
Constructed proteasome chain 22    Rpn2    Proteasome    90.9
Constructed proteasome chain 23    Rpn9    Proteasome    100
Constructed proteasome chain 24    Rpn5    Proteasome    100
Constructed proteasome chain 25    Rpn6    Proteasome    100
Constructed proteasome chain 26    Rpn7    Proteasome    100
Constructed proteasome chain 27    Rpn3    Proteasome    100
Constructed proteasome chain 28    Rpn12    Proteasome    100
Constructed proteasome chain 29    Rpn8    Proteasome    82.9
Constructed proteasome chain 3    Pre1    Proteasome    100
Constructed proteasome chain 30    Rpn11    Proteasome    89.5
Constructed proteasome chain 31    Rpn10    Proteasome    100
Constructed proteasome chain 32    Rpn13    Proteasome    100
Constructed proteasome chain 33    Sem1    Proteasome    100
Constructed proteasome chain 34    Rpn1    Proteasome    85.9
Constructed proteasome chain 35    Pup1    Proteasome    100
Constructed proteasome chain 36    Pup3    Proteasome    100
Constructed proteasome chain 37    Pre1    Proteasome    100
Constructed proteasome chain 38    Pre2    Proteasome    100
Constructed proteasome chain 39    Pre7    Proteasome    100
Constructed proteasome chain 4    Pre2    Proteasome    100
Constructed proteasome chain 40    Pre4    Proteasome    100
Constructed proteasome chain 41    Pre3    Proteasome    100
Constructed proteasome chain 42    Pre4    Proteasome    100
Constructed proteasome chain 45    Scl1    Proteasome    100
Constructed proteasome chain 46    Pre8    Proteasome    100
Constructed proteasome chain 47    Pre9    Proteasome    100
Constructed proteasome chain 48    Pre6    Proteasome    100
Constructed proteasome chain 49    Pup2    Proteasome    100
Constructed proteasome chain 5    Pre7    Proteasome    100
Constructed proteasome chain 50    Pre5    Proteasome    100
Constructed proteasome chain 51    Pre10    Proteasome    100
Constructed proteasome chain 52    Rpt1    Proteasome    88
Constructed proteasome chain 53    Rpt2    Proteasome    100
Constructed proteasome chain 54    Rpt6    Proteasome    100
Constructed proteasome chain 55    Rpt3    Proteasome    100
Constructed proteasome chain 56    Rpt4    Proteasome    100
Constructed proteasome chain 57    Rpt5    Proteasome    93.1
Constructed proteasome chain 58    Rpn2    Proteasome    90.9
Constructed proteasome chain 59    Rpn9    Proteasome    100
Constructed proteasome chain 6    Pre3    Proteasome    100
Constructed proteasome chain 60    Rpn5    Proteasome    100
Constructed proteasome chain 61    Rpn6    Proteasome    100
Constructed proteasome chain 62    Rpn7    Proteasome    100
Constructed proteasome chain 63    Rpn3    Proteasome    100
Constructed proteasome chain 64    Rpn12    Proteasome    100
Constructed proteasome chain 65    Rpn8    Proteasome    82.9
Constructed proteasome chain 66    Rpn11    Proteasome    89.5
Constructed proteasome chain 67    Rpn10    Proteasome    100
Constructed proteasome chain 68    Rpn13    Proteasome    100
Constructed proteasome chain 69    Sem1    Proteasome    100
Constructed proteasome chain 70    Rpn1    Proteasome    85.9
Constructed proteasome chain 9    Scl1    Proteasome    100


Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I,
II and III and proteasome structures

Yeast protein    Complex    Reference    Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7



Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right;
RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the
left because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further in space. (E) Repetition of the standard DHFR PCA for selected results
from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in spot-dilution assay of selected results from the intra-complex experiment. Examples
for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates
with colony size in standard PCA (left).


Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dot surface. Green spheres: surface residues on the proteasome.
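A distance-weighted shortest path between two C-termini, as described in panels (C) and (D), can be sketched with Dijkstra's algorithm over a set of surface points. This is only an illustration of the general technique: the way the graph is built (points connected when closer than a cutoff, edges weighted by Euclidean distance), the `cutoff` parameter and the function name `shortest_surface_path` are assumptions, not details given in the thesis.

```python
import heapq
from math import dist


def shortest_surface_path(points, start, goal, cutoff):
    """Length of the shortest path between two points of a surface graph.

    points: list of (x, y, z) coordinates of surface points
    start, goal: indices into points (for example the two C-termini)
    cutoff: assumed maximum distance for two points to be connected
    Returns float('inf') if the goal cannot be reached.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        travelled, u = heapq.heappop(queue)
        if u == goal:
            return travelled
        if travelled > best.get(u, float("inf")):
            continue  # stale queue entry
        for v, coords in enumerate(points):
            if v == u:
                continue
            step = dist(points[u], coords)
            if step <= cutoff:
                candidate = travelled + step
                if candidate < best.get(v, float("inf")):
                    best[v] = candidate
                    heapq.heappush(queue, (candidate, v))
    return float("inf")


# Toy coordinates for illustration only.
surface = [(0, 0, 0), (3, 0, 0), (3, 4, 0), (10, 10, 10)]
path_length = shortest_surface_path(surface, 0, 2, cutoff=4.5)
```

With a cutoff of 4.5 the direct 5-unit hop between the first and third points is not allowed, so the path goes through the second point and is longer than the straight-line distance, which is the behaviour expected when a path has to follow a surface instead of cutting through the structure.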


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are close
in space without these associations necessarily being physical interactions. Such a method
would make it possible to probe and dissect the architecture of protein complexes in greater
depth. Concretely, the approach consisted of modifying the length of the linkers of the DHFR
PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether
increasing the linker length changed the interactions detected. It was also relevant to assess
the application of the method to the study of protein complexes using several combinations
of linkers of different lengths. Finally, the validation of the method could be completed by
comparing the results obtained with the distances measured from the available protein
structures of the proteasome.

The results of the first validation show that by changing a single parameter, namely
doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing
better identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the detected associations suggests that the specificity of the
DHFR PCA is preserved despite the change in linker length. The in-depth study of the five
protein complexes shows that this variation of the DHFR PCA detects new interactions while
preserving the specificity of the method. Indeed, among all the unique interactions detected,
more than 30% were new. One could therefore expect to obtain nearly as many new
interactions if this variation of the PCA were applied to protein complexes that have already
been studied. This percentage could vary depending on the number of combinations of linkers
of different lengths used. For example, this number could be reduced when using a single
combination, since some protein-protein associations were detectable only with a specific
combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to
be sufficient to detect the majority of the new PPIs and of those whose signal increases.
The rare cases where the signal decreased as the linker length increased are more likely
caused by steric effects than by destabilization of the proteins involved. However, these
cases can still provide structural information, notably by identifying the strongest
associations within the complex. Furthermore, the use of extended linkers is informative
about the organization of protein complexes, particularly when it involves central proteins.
Finally, the detected associations closely reflect the organization of protein complexes into
subcomplexes. By comparing the distances between proteins in the proteasome structures with
the PCA results obtained, it is possible to confirm that increasing the linker length does
allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA represents a valuable advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the ability to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker in
addition to the linkers of additional lengths, which would provide a validation and a point
of comparison and would reveal problems that may have occurred during the construction of
the proteins. For example, it becomes easier to spot a problem of incorrect recombination or
the appearance of mutations. Indeed, interactions would be observed for the correctly
constructed protein, whereas the problematic one would show none. However, adding this
control certainly makes the experiments and analyses more complex. Despite this drawback,
this variation of the DHFR PCA provides an additional hybrid method that remains relatively
simple. It requires no special infrastructure, yet can also be applied at large scale using a
robotic platform. Moreover, the DHFR PCA is an in vivo method that preserves the endogenous
promoter for protein expression. The fragments do not tend to interact spontaneously with
each other unless they are very close together, which reduces false positives. The DHFR PCA
can be performed on either solid or liquid medium. It is therefore easy to study PPIs under
several growth conditions or in the presence of cellular perturbations. It can also be
followed in real time, which gives access to the study of interaction dynamics (56). These
features offer certain advantages compared to other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths providing several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also have to be determined, so as to maximize the new information gained while accounting for the added complexity of each additional linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would give a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases, including cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1 Vidal M Cusick ME Barabasi AL Interactome networks and human disease Cell 2011144(6)986-98 2 Taylor SS Ilouz R Zhang P Kornev AP Assembly of allosteric macromolecular switches lessons from PKA Nature reviews Molecular cell biology 201213(10)646-58 3 Vandamme J Castermans D Thevelein JM Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation Cellular signalling 201224(8)1610-8 4 Conrad M Schothorst J Kankipati HN Van Zeebroeck G Rubio-Texeira M Thevelein JM Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae FEMS microbiology reviews 201438(2)254-99 5 Broach JR RAS genes in Saccharomyces cerevisiae signal transduction in search of a pathway Trends in genetics TIG 19917(1)28-33 6 Fontana L Partridge L Longo VD Extending healthy life span--from yeast to humans Science 2010328(5976)321-6 7 Wong W Scott JD AKAP signalling complexes focal points in space and time Nature reviews Molecular cell biology 20045(12)959-70 8 Beuschlein F Fassnacht M Assie G Calebiro D Stratakis CA Osswald A et al Constitutive activation of PKA catalytic subunit in adrenal Cushings syndrome N Engl J Med 2014370(11)1019-28 9 Bult CJ Drabkin HJ Evsikov A Natale D Arighi C Roberts N et al The representation of protein complexes in the Protein Ontology (PRO) BMC Bioinformatics 201112371 10 Peters JM Cejka Z Harris JR Kleinschmidt JA Baumeister W Structural features of the 26 S proteasome complex J Mol Biol 1993234(4)932-7 11 Voges D Zwickl P Baumeister W The 26S proteasome a molecular machine designed for controlled proteolysis Annual review of biochemistry 1999681015-68 12 Tanaka K The proteasome overview of structure and functions Proceedings of the Japan Academy Series B Physical and biological sciences 200985(1)12-36 13 Wehmer M Sakata E Recent advances in the structural biology of the 26S proteasome Int J Biochem Cell Biol 201679437-42 14 Gomes AV Genetics of proteasome diseases Scientifica 20132013637629 15 Miller Z Ao L Kim KB Lee 
W Inhibitors of the immunoproteasome current status and future directions Current pharmaceutical design 201319(22)4140-51 16 Kaur G Batra S Emerging role of immunoproteasomes in pathophysiology Immunology and cell biology 201694(9)812-20 17 Rual J-F Venkatesan K Hao T Hirozane-Kishikawa T Dricot A Li N et al Towards a proteome-scale map of the human protein-protein interaction network Nature 2005437(7062)1173-8 18 Krogan NJ Cagney G Yu H Zhong G Guo X Ignatchenko A et al Global landscape of protein complexes in the yeast Saccharomyces cerevisiae Nature 2006440(7084)637-43 19 Collins SR Kemmeren P Zhao XC Greenblatt JF Spencer F Holstege FC et al Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae Molecular amp cellular proteomics MCP 20076(3)439-50 20 Gavin AC Aloy P Grandi P Krause R Boesche M Marzioch M et al Proteome survey reveals modularity of the yeast cell machinery Nature 2006440(7084)631-6 21 Giot L Bader JS Brouwer C Chaudhuri A Kuang B Li Y et al A protein interaction map of Drosophila melanogaster Science 2003302(5651)1727-36


22 Li S Armstrong CM Bertin N Ge H Milstein S Boxem M et al A map of the interactome network of the metazoan C elegans Science 2004303(5657)540-3 23 Rajagopala SV Sikorski P Kumar A Mosca R Vlasblom J Arnold R et al The binary protein-protein interaction landscape of Escherichia coli Nat Biotech 201432(3)285-90 24 Parrish JR Yu J Liu G Hines JA Chan JE Mangiola BA et al A proteome-wide protein interaction map for Campylobacter jejuni Genome Biology 20078(7)1-19 25 Wang Y Cui T Zhang C Yang M Huang Y Li W et al Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv Journal of proteome research 20109(12)6665-77 26 Cherkasov A Hsing M Zoraghi R Foster LJ See RH Stoynov N et al Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus Journal of proteome research 201110(3)1139-50 27 Hagen N Bayer K Rosch K Schindler M The intraviral protein interaction network of hepatitis C virus Molecular amp cellular proteomics MCP 201413(7)1676-89 28 Fossum E Friedel CC Rajagopala SV Titz B Baiker A Schmidt T et al Evolutionarily conserved herpesviral protein interaction networks PLoS pathogens 20095(9)e1000570 29 Stellberger T Hauser R Baiker A Pothineni VR Haas J Uetz P Improving the yeast two-hybrid system with permutated fusions proteins the Varicella Zoster Virus interactome Proteome science 201088 30 Obado SO Brillantes M Uryu K Zhang W Ketaren NE Chait BT et al Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex PLoS biology 201614(2)e1002365 31 Diss G Dube AK Boutin J Gagnon-Arsenault I Landry CR A systematic approach for the genetic dissection of protein complexes in living cells Cell Rep 20133(6)2155-67 32 Ferreira LG Oliva G Andricopulo AD Protein-protein interaction inhibitors advances in anticancer drug design Expert opinion on drug discovery 2016 33 Hamdi A Colas P Yeast two-hybrid methods and their applications in drug discovery Trends in pharmacological sciences 
201233(2)109-18 34 Zoraghi R Reiner NE Protein interaction networks as starting points to identify novel antimicrobial drug targets Current opinion in microbiology 201316(5)566-72 35 Khare S Nagle AS Biggart A Lai YH Liang F Davis LC et al Proteasome inhibition for treatment of leishmaniasis Chagas disease and sleeping sickness Nature 2016 36 Sahni N Yi S Taipale M Fuxman Bass JI Coulombe-Huntington J Yang F et al Widespread macromolecular interaction perturbations in human genetic disorders Cell 2015161(3)647-60 37 Jensen LJ Bork P Biochemistry Not comparable but complementary Science 2008322(5898)56-7 38 Syafrizayanti Betzen C Hoheisel JD Kastelic D Methods for analyzing and quantifying protein-protein interaction Expert review of proteomics 201411(1)107-20 39 Marcilla M Albar JP Quantitative proteomics A strategic ally to map protein interaction networks IUBMB life 201365(1)9-16 40 Woods AG Sokolowska I Ngounou Wetie AG Wormwood K Aslebagh R Patel S et al Mass spectrometry for proteomics-based investigation Advances in experimental medicine and biology 20148061-32 41 Chen GI Gingras AC Affinity-purification mass spectrometry (AP-MS) of serinethreonine phosphatases Methods 200742(3)298-305 42 Dunham WH Mullin M Gingras AC Affinity-purification coupled to mass spectrometry basic principles and strategies Proteomics 201212(10)1576-90


43 Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69 44 Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6 45 Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32 46 Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54 47 Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36) 48 Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82 49 Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88 50 Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848 51 Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6 52 Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4 53 Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63 54 Michnick SW Exploring protein interactions by interaction-induced folding of proteins from 
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14


64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9


84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709


Baits and preys were positioned so that, within a block of four strains, all combinations of linker sizes could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL). Each block of bait-prey interactions was present in 14 replicates for the RNApol and COG complexes and in 16 replicates for the proteasome complex. The blocks were randomly positioned on the colony arrays. Finally, each 1536-array was designed with a double border of a strain showing a weak interaction (Pop2-2xL-F[1,2]-Arc35-2xL-F[3]) to avoid border effects on colony growth.
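The block design described above can be illustrated with a short sketch. It only enumerates the four linker combinations for one bait-prey pair; the protein names `BaitA` and `PreyB` and the strain label format are hypothetical and are not taken from the study.

```python
from itertools import product

# Linker lengths carried by the bait (DHFR F[1,2]) and prey (DHFR F[3]) fusions.
BAIT_LINKERS = ("2xL", "4xL")
PREY_LINKERS = ("2xL", "4xL")


def linker_block(bait, prey):
    """Return labels for the four strains that make up one block.

    The label format is a hypothetical convention used for illustration.
    """
    return [
        f"{bait}-{b}-F[1,2] x {prey}-{p}-F[3]"
        for b, p in product(BAIT_LINKERS, PREY_LINKERS)
    ]


# One block for a hypothetical bait-prey pair.
block = linker_block("BaitA", "PreyB")
for strain in block:
    print(strain)
```

The order produced by `product` matches the order in which the combinations are listed in the text (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL).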

Bait plates were first prepared from 10 mL saturated cultures in YPD+clonNAT (for MATa cells) or YPD+HygB (for MATα cells) that were plated on YPD Omnitray plates and incubated at 30 °C for 24 h. Cells were then printed on a 1536-array with a 1536-pin (or a 384-pin) replicating tool manipulated by a BM3-BC automated colony processing robot (S&P Robotics) and incubated for another 24 h at 30 °C. In parallel, prey plates were assembled by arraying strains onto specific positions in a 96-format with a re-arraying tool. Colonies were further condensed in 384-format arrays and finally in 1536-format arrays using a 96-pin and a 384-pin replicating tool, respectively. Two different prey plates of 1536-format were generated and replicated a few times to have enough cells to perform crosses with all of the individual baits. Second, each 1536-bait plate was crossed with the two 1536-prey plates with a 1536-pin replicating tool and incubated for two days at 30 °C. Two rounds of diploid selection were performed on YPD+clonNAT+HygB with an incubation time of two days at 30 °C per round. Finally, diploid strains were replicated on MTX medium and incubated at 30 °C for four days, after which a second round of MTX selection was performed. Plates were incubated at 30 °C for another four days. Images were taken with an EOS Rebel T3i camera (Canon) each day from the second round of diploid selection to the end of the experiment.

For the global PCA experiment, we confirmed by standard DHFR PCA 25 PPIs for which differences in signal were increased, null or decreased. The same procedure as described above was used to assess the growth on MTX medium of selected diploid cells resulting from a new cross between bait and prey strains. The correlation between the results of the two experiments can be seen in Fig. S1E. For the intra-complexes experiment, we confirmed results for 10 pairs of interacting proteins by measuring cell growth in a spot-dilution assay (Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/mL of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).
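As a small worked example of the five-fold serial dilution, the sketch below lists the cell density expected in each spot. The starting density of 1 and the dilution factor come from the text; the number of spots (five) is an assumption, since the text does not state it.

```python
def serial_dilution(start_od, fold, n_spots):
    """Return the density of each spot in a serial dilution series."""
    return [start_od / fold**k for k in range(n_spots)]


# n_spots=5 is an assumed value chosen for illustration only.
ods = serial_dilution(start_od=1.0, fold=5, n_spots=5)
for k, od in enumerate(ods):
    print(f"spot {k}: OD600/mL = {od:.4f}")
```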

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function, as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted, and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files, and a manual threshold was applied across plates. We selected colonies for measurement with a circular selection, using particle detection with the built-in function “Analyze particle” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, using the particle closest to the center. We considered the particle to be a colony if the mass center was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
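The two-way median polish mentioned above can be sketched in a few lines. This is a generic implementation of Tukey's procedure on a small matrix, not the custom ImageJ script used in the study. The function names, the fixed number of iterations, and the reading of "averaged with the raw image" as the mean of the raw value and the fitted additive surface are all assumptions.

```python
from statistics import median


def median_polish(matrix, n_iter=10):
    """Tukey's two-way median polish.

    Decomposes matrix[i][j] into overall + row_eff[i] + col_eff[j] + resid[i][j].
    """
    resid = [list(row) for row in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep out row medians.
        for i in range(n_rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep out column medians.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
    return overall, row_eff, col_eff, resid


def smooth_and_average(raw):
    """Average the raw matrix with its additive median-polish fit.

    Assumption: this is one possible reading of the correction in the text.
    """
    overall, row_eff, col_eff, _ = median_polish(raw)
    return [
        [(raw[i][j] + overall + row_eff[i] + col_eff[j]) / 2
         for j in range(len(raw[0]))]
        for i in range(len(raw))
    ]
```

For a purely additive matrix (row effect plus column effect), the residuals are zero and the corrected matrix equals the raw one; a locally bright pixel would be pulled halfway toward the smooth surface.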

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.

For the global PCA experiment, interactions with at least two replicates for all linker combinations were retained, and the median colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is - b)/sdb). Interactions with a Zs greater than 2.5 were considered significant detected interactions. These Zs were used to compare the same interaction across linker size combinations. We considered changes significant when Zs differed by more than 2.
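The scoring step can be sketched as follows. The study used the R package mixtools; the code below is a stand-in written in plain Python, a basic expectation-maximization fit of a two-component normal mixture, and is not the mixtools implementation. The initialization at the minimum and maximum of the data, the iteration count, and all function names are assumptions made for illustration.

```python
import math
from statistics import pstdev


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def fit_two_normals(values, n_iter=200):
    """Fit a two-component 1-D normal mixture by EM.

    Returns (weights, means, sds), with the lower-mean (background)
    component first. Initialization at min/max is an assumption that
    works when the two populations are well separated.
    """
    xs = list(values)
    mus = [min(xs), max(xs)]
    start_sd = max(pstdev(xs), 1e-3)
    sds = [start_sd, start_sd]
    ws = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in xs:
            p = [ws[k] * normal_pdf(x, mus[k], sds[k]) for k in range(2)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: update weights, means and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue
            ws[k] = nk / len(xs)
            mus[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sds[k] = max(math.sqrt(var), 1e-3)
    order = sorted(range(2), key=lambda k: mus[k])
    return ([ws[k] for k in order], [mus[k] for k in order], [sds[k] for k in order])


def z_score(interaction_score, b, sdb):
    """Zs = (Is - b) / sdb, with b and sdb taken from the background component."""
    return (interaction_score - b) / sdb
```

With the background parameters in hand, an interaction would be called detected when `z_score(Is, b, sdb)` exceeds 2.5, and two linker combinations would be considered different when their Zs differ by more than 2.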

For the intra-complexes experiment, extreme outliers on the MTX selection plates that were more distant from the median than Q1 - 3(Q3 - Q1) or Q3 + 3(Q3 - Q1) were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as well as strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained, and the median colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
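The extreme-outlier rule used at the start of this filtering can be sketched as below. The fences Q1 - 3(Q3 - Q1) and Q3 + 3(Q3 - Q1) come from the text; the quartile definition (Python's default "exclusive" method) is an assumption and may differ slightly from the one used in the original analysis.

```python
from statistics import quantiles


def extreme_outlier_fences(values, k=3.0):
    """Return (lower, upper) fences: Q1 - k*(Q3 - Q1) and Q3 + k*(Q3 - Q1)."""
    q1, _, q3 = quantiles(values, n=4)  # default quartile method: an assumption
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_extreme_outliers(values, k=3.0):
    """Keep only the values that lie within the extreme-outlier fences."""
    lower, upper = extreme_outlier_fences(values, k)
    return [v for v in values if lower <= v <= upper]


# Hypothetical replicate colony sizes for one interaction.
replicates = [10, 11, 12, 13, 14, 15, 16, 100]
kept = drop_extreme_outliers(replicates)
```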

At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, because baits and preys were positioned so that, within a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL), the Is for the different linker size combinations could be compared directly. The difference from the reference 2xL-2xL interaction was calculated for each linker combination: 2xL-4xL, 4xL-2xL and 4xL-4xL. A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1, with t-test FDR p-value < 0.05) (Table S1C).
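The classification rules above can be sketched in code. The text does not name the FDR procedure, so the Benjamini-Hochberg adjustment used here is an assumption, as are the function names and the handling of the case the text leaves open (a significant p-value with an absolute difference of 1 or less, which this sketch labels unchanged). The paired t-test itself is not reimplemented; the sketch starts from its p-values.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(detected_ref, detected_longer, diffs, fdr_p):
    """Classify one protein pair.

    detected_ref: detected with the 2xL-2xL reference.
    detected_longer: detected with at least one longer combination.
    diffs, fdr_p: difference from the reference and FDR-corrected p-value
    for each longer combination (2xL-4xL, 4xL-2xL, 4xL-4xL).
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    changed = any(abs(d) > 1 and p < 0.05 for d, p in zip(diffs, fdr_p))
    # Assumption: significant but small differences count as unchanged.
    return "quantitative change" if changed else "unchanged"
```

For example, `classify_pair(True, True, [0.2, 1.4, 0.1], [0.60, 0.01, 0.30])` falls in the quantitative-change category because one longer combination shows a difference above 1 with a corrected p-value below 0.05.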

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure PDB ID 5A5B is composed of the base, the lid and half of the core. Structure PDB ID 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two PDB 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in the PyMOL software. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core. The inner rings of the core were from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes using a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the “show dots” and “set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome, respectively. These dots were exported in the “wrl” graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal distances possible was used for the analyses.
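The surface-path distance can be sketched as follows. The study used the Dijkstra implementation in NetworkX; the version below is a standard-library stand-in so that the example is self-contained. The node names, coordinates and the cutoff value in the example are hypothetical, and the surface-residue selection step in PyMOL is not reproduced.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff`; edge weight = distance.

    coords maps a node name to its (x, y, z) coordinates, e.g. the C-alpha
    atom of a surface residue.
    """
    nodes = list(coords)
    graph = {n: [] for n in nodes}
    for idx, a in enumerate(nodes):
        for b in nodes[idx + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > best.get(node, math.inf):
            continue  # stale heap entry
        for nxt, w in graph[node]:
            nd = d + w
            if nd < best.get(nxt, math.inf):
                best[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return math.inf


# Hypothetical surface nodes: A and D are 20 units apart in a straight line,
# but with a cutoff of 15 the path must go through B and C.
coords = {"A": (0, 0, 0), "B": (5, 10, 0), "C": (15, 10, 0), "D": (20, 0, 0)}
graph = build_graph(coords, cutoff=15)
path_length = shortest_path_length(graph, "A", "D")
```

In this toy example the path length along the graph is longer than the straight-line distance, which is the point of following surface nodes rather than measuring through the complex.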

All PPI data related to the global PCA and intra-complexes experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n = 45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
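The size of the screen and the two criteria used in this paragraph (a z-score above 2.5 to call a PPI, and a greater than two-fold z-score difference between linkers) can be made concrete with a short Python sketch. It is a minimal illustration under stated assumptions: the pair names and z-scores are invented, and the rule used to compute a fold change (ratio of z-scores, restricted to pairs with positive z-scores that are detected with at least one linker) is our assumption, since this section does not spell it out.

```python
Z_THRESHOLD = 2.5   # detection threshold given in the text
FOLD = 2.0          # "greater than two-fold" z-score difference

# 15 proteins x 3 linker lengths (2xL, 3xL, 4xL) = 45 baits, queried against 592 preys.
n_baits = 15 * 3
n_preys = 592
n_tested = n_baits * n_preys
print(n_tested)

def call_hits(zscores, threshold=Z_THRESHOLD):
    """Return the pairs whose z-score exceeds the detection threshold."""
    return {pair for pair, z in zscores.items() if z > threshold}

def fold_change_calls(z_ref, z_test, fold=FOLD, threshold=Z_THRESHOLD):
    """Flag pairs whose z-score changes more than `fold` between two linkers.

    Assumption: pairs undetected with both linkers, or with a non-positive
    z-score in either assay, are skipped because the ratio is not meaningful.
    """
    calls = {}
    for pair in z_ref.keys() & z_test.keys():
        ref, test = z_ref[pair], z_test[pair]
        if max(ref, test) <= threshold or min(ref, test) <= 0:
            continue
        ratio = test / ref
        if ratio > fold:
            calls[pair] = "increase"
        elif ratio < 1 / fold:
            calls[pair] = "decrease"
    return calls

# Hypothetical z-scores for three bait-prey pairs with the 2xL and the 4xL.
z_2xl = {("bait1", "prey1"): 3.0, ("bait2", "prey2"): 8.0, ("bait3", "prey3"): 1.0}
z_4xl = {("bait1", "prey1"): 7.5, ("bait2", "prey2"): 9.0, ("bait3", "prey3"): 1.2}

hits_2xl = call_hits(z_2xl)
changes = fold_change_calls(z_2xl, z_4xl)
print(sorted(hits_2xl), changes)
```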

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and COG complexes were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtering (197 unique pairs when reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], are counted as a single PPI).
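The distinction between all interacting pairs and unique pairs amounts to ignoring the bait/prey orientation. A minimal Python sketch of that bookkeeping follows; the protein labels are placeholders, and the function name is ours.

```python
def unique_pairs(directed_hits):
    """Collapse (bait, prey) orientations into unordered protein pairs."""
    return {frozenset(pair) for pair in directed_hits}

# X-Y was detected in both orientations, X-Z in one orientation only.
directed = {("X", "Y"), ("Y", "X"), ("X", "Z")}
unique = unique_pairs(directed)
print(len(directed), len(unique))
```

Using `frozenset` rather than a sorted tuple also handles a protein tested against itself, which collapses to a one-element set.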

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.
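The counts quoted in this paragraph can be cross-checked with a few lines of Python. The numbers are those given in the text; only the variable names are ours.

```python
# (all interacting pairs, unique pairs) for each category, as quoted in the text.
counts = {
    "unchanged": (135, 114),
    "quantitatively changed": (19, 18),
    "new": (74, 65),
}

total_all = sum(a for a, _ in counts.values())
total_unique = sum(u for _, u in counts.values())
affected_all = total_all - counts["unchanged"][0]
affected_unique = total_unique - counts["unchanged"][1]

print(total_all, total_unique, affected_all, affected_unique)
for label, (a, u) in counts.items():
    print(f"{label}: {a / total_all:.1%} of pairs, {u / total_unique:.1%} of unique pairs")
```

The three categories add up to the 228 (197 unique) interacting pairs, and the changed and new categories together give the 93 (83 unique) pairs affected by linker length.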

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from the destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1, from the core subunit, and proteins from lobe a or lobe b (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, the PCA signal obtained with longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid), regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.
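The fraction of detected PPIs per combination of subcomplexes (the quantity plotted in Fig. 1E) could be computed along the following lines. This is a sketch under assumptions: the protein labels, subcomplex assignments and detection calls are invented for illustration, and the function name is ours.

```python
from collections import defaultdict

def detection_fraction(tested_pairs, detected_pairs, subcomplex_of):
    """Fraction of tested pairs that are detected, per pair of subcomplexes."""
    tested = defaultdict(int)
    detected = defaultdict(int)
    for pair in tested_pairs:
        a, b = pair
        # Sort the labels so that base-lid and lid-base fall in the same class.
        key = tuple(sorted((subcomplex_of[a], subcomplex_of[b])))
        tested[key] += 1
        if pair in detected_pairs:
            detected[key] += 1
    return {key: detected[key] / tested[key] for key in tested}

# Hypothetical assignments and results.
subcomplex_of = {"p1": "base", "p2": "base", "p3": "lid", "p4": "core"}
tested = [("p1", "p2"), ("p1", "p3"), ("p2", "p3"), ("p3", "p4")]
detected = {("p1", "p2"), ("p1", "p3")}

fractions = detection_fraction(tested, detected, subcomplex_of)
print(fractions)
```

Running the same function once per linker combination would give one set of fractions per combination, which is what allows them to be compared across linker lengths.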

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases the signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.
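The negative relationship between pairwise distance and z-score is summarized with a Spearman rank correlation (Fig. 2B). As a reminder of what that statistic measures, here is a standard-library Python sketch; in practice a library routine such as `scipy.stats.spearmanr` would normally be used, and this section does not state which software produced the reported values. The distances and z-scores below are invented.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical C-terminus distances (Å) and PPI z-scores for five protein pairs.
distances = [20.0, 45.0, 60.0, 90.0, 120.0]
zscores = [9.1, 7.4, 7.9, 3.2, 1.0]
rho = spearman(distances, zscores)
print(round(rho, 3))
```

A rank-based statistic is a natural choice here because it only asks whether z-scores tend to decrease as distance increases, without assuming that the relationship is linear.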

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here, we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information on the subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.

Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circles: RNApol I, II and III. Orange squares: proteasome. Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.

Figure 2. Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.

Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference | Yeast protein | Complex | Identity (%)
4C2M chain 1 | Rpc10 | RNApol I | 100
4C2M chain 2 | Rpa34 | RNApol I | 92.4
4C2M chain 3 | Rpa49 | RNApol I | 94.4
4C2M chain 4 | Rpa43 | RNApol I | 100
4C2M chain 5 | Rpa190 | RNApol I | 89.7
4C2M chain 6 | Rpc40 | RNApol I | 100
4C2M chain 7 | Rpa135 | RNApol I | 97.2
4C2M chain 8 | Rpb5 | RNApol I | 100
4C2M chain 9 | Rpa14 | RNApol I | 59.6
4C2M chain 10 | Rpa43 | RNApol I | 81.4
4C2M chain 11 | Rpo26 | RNApol I | 100
4C2M chain 12 | Rpa12 | RNApol I | 100
4C2M chain 13 | Rpb8 | RNApol I | 88.2
4C2M chain 14 | Rpc19 | RNApol I | 100
4C2M chain 15 | Rpb10 | RNApol I | 100
4C2M chain 16 | Rpa49 | RNApol I | 100
4C2M chain 17 | Rpc10 | RNApol I | 100
4C2M chain 18 | Rpa43 | RNApol I | 100
4C2M chain 19 | Rpa34 | RNApol I | 92.4
4C2M chain 20 | Rpa135 | RNApol I | 96.2
4C2M chain 21 | Rpa190 | RNApol I | 88.5
4C2M chain 22 | Rpa14 | RNApol I | 55.1
4C2M chain 23 | Rpc40 | RNApol I | 100
4C2M chain 24 | Rpo26 | RNApol I | 100
4C2M chain 25 | Rpb5 | RNApol I | 100
4C2M chain 26 | Rpb8 | RNApol I | 88.2
4C2M chain 27 | Rpa43 | RNApol I | 80.2
4C2M chain 28 | Rpb10 | RNApol I | 100
4C2M chain 29 | Rpa12 | RNApol I | 96
4C2M chain 30 | Rpc19 | RNApol I | 100
4C3I chain A | Rpa190 | RNApol I | 89.2
4C3I chain C | Rpc40 | RNApol I | 99.3
4C3I chain B | Rpa135 | RNApol I | 98.2
4C3I chain E | Rpb5 | RNApol I | 100
4C3I chain D | Rpa14 | RNApol I | 55.1
4C3I chain G | Rpa43 | RNApol I | 78.3
4C3I chain F | Rpo26 | RNApol I | 100
4C3I chain I | Rpa12 | RNApol I | 100
4C3I chain H | Rpb8 | RNApol I | 84.7
4C3I chain K | Rpc19 | RNApol I | 100
4C3I chain J | Rpb10 | RNApol I | 100
4C3I chain M | Rpa49 | RNApol I | 97.2
4C3I chain L | Rpc10 | RNApol I | 100
4C3I chain N | Rpa34 | RNApol I | 88
4V1N chain A | Rpo21 | RNApol II | 97.9
4V1N chain C | Rpb3 | RNApol II | 100
4V1N chain B | Rpb2 | RNApol II | 93.6
4V1N chain E | Rpb5 | RNApol II | 100
4V1N chain D | Rpb4 | RNApol II | 80.8
4V1N chain G | Rpb7 | RNApol II | 100
4V1N chain F | Rpo26 | RNApol II | 100
4V1N chain I | Rpb9 | RNApol II | 100
4V1N chain H | Rpb8 | RNApol II | 91
4V1N chain K | Rpb11 | RNApol II | 100
4V1N chain J | Rpb10 | RNApol II | 100
4V1N chain L | Rpc10 | RNApol II | 100
4V1N chain R | Tfg2 | RNApol II | 60.3
5FJA chain A | Rpo31 | RNApol III | 96.2
5FJA chain C | Rpc40 | RNApol III | 100
5FJA chain B | Ret1 | RNApol III | 100
5FJA chain E | Rpb5 | RNApol III | 100
5FJA chain D | Rpc17 | RNApol III | 73.9
5FJA chain G | Rpc25 | RNApol III | 85.8
5FJA chain F | Rpo26 | RNApol III | 100
5FJA chain I | Rpc11 | RNApol III | 82.7
5FJA chain H | Rpb8 | RNApol III | 94.5
5FJA chain K | Rpc19 | RNApol III | 100
5FJA chain J | Rpb10 | RNApol III | 100
5FJA chain M | Rpc37 | RNApol III | 84.9
5FJA chain L | Rpc10 | RNApol III | 100
5FJA chain O | Rpc82 | RNApol III | 84.3
5FJA chain N | Rpc53 | RNApol III | 73.8
5FJA chain Q | Rpc31 | RNApol III | 100
5FJA chain P | Rpc34 | RNApol III | 57.2

Table S2C. Identity between the proteasome structure and the experimental sequences

Reference | Yeast protein | Complex | Identity (%)
5CZ4-centered chain A | Pre8 | Proteasome | 100
5CZ4-centered chain AA | Pre4 | Proteasome | 100
5CZ4-centered chain B | Pre9 | Proteasome | 100
5CZ4-centered chain BA | Pre3 | Proteasome | 100
5CZ4-centered chain C | Pre6 | Proteasome | 100
5CZ4-centered chain D | Pup2 | Proteasome | 97.1
5CZ4-centered chain E | Pre5 | Proteasome | 100
5CZ4-centered chain F | Pre10 | Proteasome | 100
5CZ4-centered chain G | Scl1 | Proteasome | 100
5CZ4-centered chain H | Pup1 | Proteasome | 100
5CZ4-centered chain I | Pup3 | Proteasome | 100
5CZ4-centered chain J | Pre1 | Proteasome | 100
5CZ4-centered chain K | Pre2 | Proteasome | 100
5CZ4-centered chain L | Pre7 | Proteasome | 100
5CZ4-centered chain M | Pre4 | Proteasome | 100
5CZ4-centered chain N | Pre3 | Proteasome | 100
5CZ4-centered chain O | Pre8 | Proteasome | 100
5CZ4-centered chain P | Pre9 | Proteasome | 100
5CZ4-centered chain Q | Pre6 | Proteasome | 100
5CZ4-centered chain R | Pup2 | Proteasome | 97.1
5CZ4-centered chain S | Pre5 | Proteasome | 100
5CZ4-centered chain T | Pre10 | Proteasome | 100
5CZ4-centered chain U | Scl1 | Proteasome | 100
5CZ4-centered chain V | Pup1 | Proteasome | 100
5CZ4-centered chain W | Pup3 | Proteasome | 100
5CZ4-centered chain X | Pre1 | Proteasome | 100
5CZ4-centered chain Y | Pre2 | Proteasome | 100
5CZ4-centered chain Z | Pre7 | Proteasome | 100
5A5B-centered chain A | Pre3 | Proteasome | 100
5A5B-centered chain AA | Rpn7 | Proteasome | 100
5A5B-centered chain B | Pup1 | Proteasome | 100
5A5B-centered chain BA | Rpn3 | Proteasome | 100
5A5B-centered chain C | Pup3 | Proteasome | 100
5A5B-centered chain CA | Rpn12 | Proteasome | 100
5A5B-centered chain D | Pre1 | Proteasome | 100
5A5B-centered chain DA | Rpn8 | Proteasome | 82.9
5A5B-centered chain E | Pre2 | Proteasome | 99.5
5A5B-centered chain EA | Rpn11 | Proteasome | 89.5
5A5B-centered chain F | Pre7 | Proteasome | 100
5A5B-centered chain FA | Rpn10 | Proteasome | 100
5A5B-centered chain G | Pre4 | Proteasome | 100
5A5B-centered chain GA | Rpn13 | Proteasome | 100
5A5B-centered chain HA | Sem1 | Proteasome | 100
5A5B-centered chain IA | Rpn1 | Proteasome | 85.9
5A5B-centered chain J | Scl1 | Proteasome | 100
5A5B-centered chain K | Pre8 | Proteasome | 100
5A5B-centered chain L | Pre9 | Proteasome | 100
5A5B-centered chain M | Pre6 | Proteasome | 100
5A5B-centered chain N | Pup2 | Proteasome | 100
5A5B-centered chain O | Pre5 | Proteasome | 100
5A5B-centered chain P | Pre10 | Proteasome | 100
5A5B-centered chain Q | Rpt1 | Proteasome | 88
5A5B-centered chain R | Rpt2 | Proteasome | 100
5A5B-centered chain S | Rpt6 | Proteasome | 100
5A5B-centered chain T | Rpt3 | Proteasome | 100
5A5B-centered chain U | Rpt4 | Proteasome | 100
5A5B-centered chain V | Rpt5 | Proteasome | 93.1
5A5B-centered chain W | Rpn2 | Proteasome | 90.9
5A5B-centered chain X | Rpn9 | Proteasome | 100
5A5B-centered chain Y | Rpn5 | Proteasome | 100
5A5B-centered chain Z | Rpn6 | Proteasome | 100
Constructed proteasome chain 1 | Pup1 | Proteasome | 100
Constructed proteasome chain 10 | Pre8 | Proteasome | 100
Constructed proteasome chain 11 | Pre9 | Proteasome | 100
Constructed proteasome chain 12 | Pre6 | Proteasome | 100
Constructed proteasome chain 13 | Pup2 | Proteasome | 100
Constructed proteasome chain 14 | Pre5 | Proteasome | 100
Constructed proteasome chain 15 | Pre10 | Proteasome | 100
Constructed proteasome chain 16 | Rpt1 | Proteasome | 88
Constructed proteasome chain 17 | Rpt2 | Proteasome | 100
Constructed proteasome chain 18 | Rpt6 | Proteasome | 100
Constructed proteasome chain 19 | Rpt3 | Proteasome | 100
Constructed proteasome chain 2 | Pup3 | Proteasome | 100
Constructed proteasome chain 20 | Rpt4 | Proteasome | 100
Constructed proteasome chain 21 | Rpt5 | Proteasome | 93.1
Constructed proteasome chain 22 | Rpn2 | Proteasome | 90.9
Constructed proteasome chain 23 | Rpn9 | Proteasome | 100
Constructed proteasome chain 24 | Rpn5 | Proteasome | 100
Constructed proteasome chain 25 | Rpn6 | Proteasome | 100
Constructed proteasome chain 26 | Rpn7 | Proteasome | 100
Constructed proteasome chain 27 | Rpn3 | Proteasome | 100
Constructed proteasome chain 28 | Rpn12 | Proteasome | 100
Constructed proteasome chain 29 | Rpn8 | Proteasome | 82.9
Constructed proteasome chain 3 | Pre1 | Proteasome | 100
Constructed proteasome chain 30 | Rpn11 | Proteasome | 89.5
Constructed proteasome chain 31 | Rpn10 | Proteasome | 100
Constructed proteasome chain 32 | Rpn13 | Proteasome | 100
Constructed proteasome chain 33 | Sem1 | Proteasome | 100
Constructed proteasome chain 34 | Rpn1 | Proteasome | 85.9
Constructed proteasome chain 35 | Pup1 | Proteasome | 100
Constructed proteasome chain 36 | Pup3 | Proteasome | 100
Constructed proteasome chain 37 | Pre1 | Proteasome | 100
Constructed proteasome chain 38 | Pre2 | Proteasome | 100
Constructed proteasome chain 39 | Pre7 | Proteasome | 100
Constructed proteasome chain 4 | Pre2 | Proteasome | 100
Constructed proteasome chain 40 | Pre4 | Proteasome | 100
Constructed proteasome chain 41 | Pre3 | Proteasome | 100
Constructed proteasome chain 42 | Pre4 | Proteasome | 100
Constructed proteasome chain 45 | Scl1 | Proteasome | 100
Constructed proteasome chain 46 | Pre8 | Proteasome | 100
Constructed proteasome chain 47 | Pre9 | Proteasome | 100
Constructed proteasome chain 48 | Pre6 | Proteasome | 100
Constructed proteasome chain 49 | Pup2 | Proteasome | 100
Constructed proteasome chain 5 | Pre7 | Proteasome | 100
Constructed proteasome chain 50 | Pre5 | Proteasome | 100
Constructed proteasome chain 51 | Pre10 | Proteasome | 100
Constructed proteasome chain 52 | Rpt1 | Proteasome | 88
Constructed proteasome chain 53 | Rpt2 | Proteasome | 100
Constructed proteasome chain 54 | Rpt6 | Proteasome | 100
Constructed proteasome chain 55 | Rpt3 | Proteasome | 100
Constructed proteasome chain 56 | Rpt4 | Proteasome | 100
Constructed proteasome chain 57 | Rpt5 | Proteasome | 93.1
Constructed proteasome chain 58 | Rpn2 | Proteasome | 90.9
Constructed proteasome chain 59 | Rpn9 | Proteasome | 100
Constructed proteasome chain 6 | Pre3 | Proteasome | 100
Constructed proteasome chain 60 | Rpn5 | Proteasome | 100
Constructed proteasome chain 61 | Rpn6 | Proteasome | 100
Constructed proteasome chain 62 | Rpn7 | Proteasome | 100
Constructed proteasome chain 63 | Rpn3 | Proteasome | 100
Constructed proteasome chain 64 | Rpn12 | Proteasome | 100
Constructed proteasome chain 65 | Rpn8 | Proteasome | 82.9
Constructed proteasome chain 66 | Rpn11 | Proteasome | 89.5
Constructed proteasome chain 67 | Rpn10 | Proteasome | 100
Constructed proteasome chain 68 | Rpn13 | Proteasome | 100
Constructed proteasome chain 69 | Sem1 | Proteasome | 100
Constructed proteasome chain 70 | Rpn1 | Proteasome | 85.9
Constructed proteasome chain 9 | Scl1 | Proteasome | 100

Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Yeast protein | Complex | Reference | Number of missing residues in C-ter
Rpa190 | RNApol I | 4C2M monomer 1 | 0
Rpa14 | RNApol I | 4C2M monomer 1 | 37
Rpa12 | RNApol I | 4C2M monomer 1 | 0
Rpb5 | RNApol I | 4C2M monomer 1 | 0
Rpb10 | RNApol I | 4C2M monomer 1 | 1
Rpa49 | RNApol I | 4C2M monomer 1 | 300
Rpc19 | RNApol I | 4C2M monomer 1 | 0
Rpb8 | RNApol I | 4C2M monomer 1 | 0
Rpa34 | RNApol I | 4C2M monomer 1 | 52
Rpa43 | RNApol I | 4C2M monomer 1 | 10
Rpc40 | RNApol I | 4C2M monomer 1 | 0
Rpc10 | RNApol I | 4C2M monomer 1 | 0
Rpa135 | RNApol I | 4C2M monomer 1 | 0
Rpo26 | RNApol I | 4C2M monomer 1 | 1
Rpa190 | RNApol I | 4C2M monomer 2 | 0
Rpa14 | RNApol I | 4C2M monomer 2 | 37
Rpa12 | RNApol I | 4C2M monomer 2 | 0
Rpb5 | RNApol I | 4C2M monomer 2 | 0
Rpb10 | RNApol I | 4C2M monomer 2 | 1
Rpa49 | RNApol I | 4C2M monomer 2 | 300
Rpc19 | RNApol I | 4C2M monomer 2 | 0
Rpb8 | RNApol I | 4C2M monomer 2 | 0
Rpa34 | RNApol I | 4C2M monomer 2 | 53
Rpa43 | RNApol I | 4C2M monomer 2 | 76
Rpc40 | RNApol I | 4C2M monomer 2 | 0
Rpc10 | RNApol I | 4C2M monomer 2 | 0
Rpa135 | RNApol I | 4C2M monomer 2 | 0
Rpo26 | RNApol I | 4C2M monomer 2 | 1
Rpa190 | RNApol I | 4C3I | 1
Rpa14 | RNApol I | 4C3I | 37
Rpb5 | RNApol I | 4C3I | 0
Rpb10 | RNApol I | 4C3I | 1
Rpa49 | RNApol I | 4C3I | 301
Rpc19 | RNApol I | 4C3I | 0
Rpb8 | RNApol I | 4C3I | 0
Rpa34 | RNApol I | 4C3I | 53
Rpa12 | RNApol I | 4C3I | 0
Rpa43 | RNApol I | 4C3I | 10
Rpc40 | RNApol I | 4C3I | 0
Rpc10 | RNApol I | 4C3I | 0
Rpa135 | RNApol I | 4C3I | 0
Rpo26 | RNApol I | 4C3I | 1
Rpb3 | RNApol II | 4V1N | 50
Rpb11 | RNApol II | 4V1N | 6
Rpb5 | RNApol II | 4V1N | 0
Rpb7 | RNApol II | 4V1N | 0
Rpb10 | RNApol II | 4V1N | 5
Rpo26 | RNApol II | 4V1N | 0
Rpb8 | RNApol II | 4V1N | 0
Rpb4 | RNApol II | 4V1N | 0
Rpb9 | RNApol II | 4V1N | 2
Tfg2 | RNApol II | 4V1N | 173
Rpb2 | RNApol II | 4V1N | 0
Rpc10 | RNApol II | 4V1N | 0
Rpo21 | RNApol II | 4V1N | 278
Rpc11 | RNApol III | 5FJA | 0
Rpc19 | RNApol III | 5FJA | 0
Ret1 | RNApol III | 5FJA | 0
Rpb5 | RNApol III | 5FJA | 0
Rpb10 | RNApol III | 5FJA | 3
Rpc37 | RNApol III | 5FJA | 20
Rpc82 | RNApol III | 5FJA | 0
Rpc31 | RNApol III | 5FJA | 182
Rpb8 | RNApol III | 5FJA | 0
Rpc53 | RNApol III | 5FJA | 0
Rpc25 | RNApol III | 5FJA | 0
Rpc34 | RNApol III | 5FJA | 2
Rpo31 | RNApol III | 5FJA | 0
Rpc40 | RNApol III | 5FJA | 0
Rpc10 | RNApol III | 5FJA | 0
Rpc17 | RNApol III | 5FJA | 0
Rpo26 | RNApol III | 5FJA | 2
Rpn6 | Proteasome | 5CZ4 and 5A5B | 3
Rpn5 | Proteasome | 5CZ4 and 5A5B | 3
Rpn3 | Proteasome | 5CZ4 and 5A5B | 45
Rpn2 | Proteasome | 5CZ4 and 5A5B | 20
Rpn1 | Proteasome | 5CZ4 and 5A5B | 0
Rpn9 | Proteasome | 5CZ4 and 5A5B | 6
Rpn8 | Proteasome | 5CZ4 and 5A5B | 30
Pre10 | Proteasome | 5CZ4 and 5A5B | 39
Pre6 | Proteasome | 5CZ4 and 5A5B | 10
Pre7 | Proteasome | 5CZ4 and 5A5B | 0
Rpt3 | Proteasome | 5CZ4 and 5A5B | 0
Rpt2 | Proteasome | 5CZ4 and 5A5B | 1
Pre2 | Proteasome | 5CZ4 and 5A5B | 0
Rpt4 | Proteasome | 5CZ4 and 5A5B | 10
Pre1 | Proteasome | 5CZ4 and 5A5B | 3
Pre8 | Proteasome | 5CZ4 and 5A5B | 0
Pre9 | Proteasome | 5CZ4 and 5A5B | 12
Pup2 | Proteasome | 5CZ4 and 5A5B | 9
Pup3 | Proteasome | 5CZ4 and 5A5B | 0
Pup1 | Proteasome | 5CZ4 and 5A5B | 6
Rpn13 | Proteasome | 5CZ4 and 5A5B | 23
Rpn12 | Proteasome | 5CZ4 and 5A5B | 2
Rpn11 | Proteasome | 5CZ4 and 5A5B | 8
Rpn10 | Proteasome | 5CZ4 and 5A5B | 71
Sem1 | Proteasome | 5CZ4 and 5A5B | 0
Scl1 | Proteasome | 5CZ4 and 5A5B | 0
Rpt1 | Proteasome | 5CZ4 and 5A5B | 11
Pre4 | Proteasome | 5CZ4 and 5A5B | 4
Pre5 | Proteasome | 5CZ4 and 5A5B | 0
Rpt5 | Proteasome | 5CZ4 and 5A5B | 0
Pre3 | Proteasome | 5CZ4 and 5A5B | 0
Rpt6 | Proteasome | 5CZ4 and 5A5B | 9
Rpn7 | Proteasome | 5CZ4 and 5A5B | 7

Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. The Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions between proteins that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in standard PCA (left).

Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.

Conclusion geacuteneacuterale

Le but de ce projet eacutetait de deacutevelopper une meacutethode hybride relativement simple Le terme

meacutethode hybride deacutesigne une meacutethode permettant de deacutetecter des associations entre des

proteacuteines agrave proximiteacute dans lrsquoespace sans qursquoelles ne soient neacutecessairement des interactions

physiques Cette meacutethode permettrait ainsi drsquoapprofondir et de mieux disseacutequer lrsquoarchitecture

des complexes proteacuteiques Concregravetement il srsquoagissait de modifier la longueur des

connecteurs de la DHFR PCA chez S cerevisiae Afin de valider la meacutethode il fallait drsquoabord

veacuterifier si lrsquoaugmentation de la longueur du connecteur permettait de modifier les interactions

deacutetecteacutees Il eacutetait eacutegalement pertinent de veacuterifier lrsquoapplication de la meacutethode pour lrsquoeacutetude de

complexes proteacuteiques agrave lrsquoaide de plusieurs combinaisons de connecteurs de diffeacuterentes

longueurs Enfin la confirmation de la validiteacute de la meacutethode pouvait ecirctre compleacuteteacutee par la

comparaison des reacutesultats obtenus avec les distances mesureacutees agrave partir des structures

proteacuteiques disponibles du proteacuteasome

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thereby improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, more than 30% of all unique interactions detected were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths that are used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases.

The rare cases in which the signal decreased as linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, extended linkers provide information on the organization of protein complexes, particularly when central proteins are involved. Finally, the detected associations reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA represents a useful advance in the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths. This would provide a validation and a point of comparison, and would help detect problems that occurred during the construction of the fusion proteins. For example, it becomes easier to spot a faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Admittedly, adding this control makes the experiments and the analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale with a robotic platform. Moreover, the DHFR PCA is an in vivo method that preserves the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are brought into very close proximity, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of the interactions (56). These features offer certain advantages over the other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would provide several resolutions of the PPI network. The first step would be to determine the maximal length that still detects plausible protein-protein associations, so as to limit false positives. The second would be to determine the optimal increment that maximizes new information while taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be considered. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by extending the linker would give us a general picture of their architecture. At the scale of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.

Bibliography

1 Vidal M Cusick ME Barabasi AL Interactome networks and human disease Cell 2011144(6)986-98 2 Taylor SS Ilouz R Zhang P Kornev AP Assembly of allosteric macromolecular switches lessons from PKA Nature reviews Molecular cell biology 201213(10)646-58 3 Vandamme J Castermans D Thevelein JM Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation Cellular signalling 201224(8)1610-8 4 Conrad M Schothorst J Kankipati HN Van Zeebroeck G Rubio-Texeira M Thevelein JM Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae FEMS microbiology reviews 201438(2)254-99 5 Broach JR RAS genes in Saccharomyces cerevisiae signal transduction in search of a pathway Trends in genetics TIG 19917(1)28-33 6 Fontana L Partridge L Longo VD Extending healthy life span--from yeast to humans Science 2010328(5976)321-6 7 Wong W Scott JD AKAP signalling complexes focal points in space and time Nature reviews Molecular cell biology 20045(12)959-70 8 Beuschlein F Fassnacht M Assie G Calebiro D Stratakis CA Osswald A et al Constitutive activation of PKA catalytic subunit in adrenal Cushings syndrome N Engl J Med 2014370(11)1019-28 9 Bult CJ Drabkin HJ Evsikov A Natale D Arighi C Roberts N et al The representation of protein complexes in the Protein Ontology (PRO) BMC Bioinformatics 201112371 10 Peters JM Cejka Z Harris JR Kleinschmidt JA Baumeister W Structural features of the 26 S proteasome complex J Mol Biol 1993234(4)932-7 11 Voges D Zwickl P Baumeister W The 26S proteasome a molecular machine designed for controlled proteolysis Annual review of biochemistry 1999681015-68 12 Tanaka K The proteasome overview of structure and functions Proceedings of the Japan Academy Series B Physical and biological sciences 200985(1)12-36 13 Wehmer M Sakata E Recent advances in the structural biology of the 26S proteasome Int J Biochem Cell Biol 201679437-42 14 Gomes AV Genetics of proteasome diseases Scientifica 20132013637629 15 Miller Z Ao L Kim KB Lee 
W Inhibitors of the immunoproteasome current status and future directions Current pharmaceutical design 201319(22)4140-51 16 Kaur G Batra S Emerging role of immunoproteasomes in pathophysiology Immunology and cell biology 201694(9)812-20 17 Rual J-F Venkatesan K Hao T Hirozane-Kishikawa T Dricot A Li N et al Towards a proteome-scale map of the human protein-protein interaction network Nature 2005437(7062)1173-8 18 Krogan NJ Cagney G Yu H Zhong G Guo X Ignatchenko A et al Global landscape of protein complexes in the yeast Saccharomyces cerevisiae Nature 2006440(7084)637-43 19 Collins SR Kemmeren P Zhao XC Greenblatt JF Spencer F Holstege FC et al Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae Molecular amp cellular proteomics MCP 20076(3)439-50 20 Gavin AC Aloy P Grandi P Krause R Boesche M Marzioch M et al Proteome survey reveals modularity of the yeast cell machinery Nature 2006440(7084)631-6 21 Giot L Bader JS Brouwer C Chaudhuri A Kuang B Li Y et al A protein interaction map of Drosophila melanogaster Science 2003302(5651)1727-36

22 Li S Armstrong CM Bertin N Ge H Milstein S Boxem M et al A map of the interactome network of the metazoan C elegans Science 2004303(5657)540-3 23 Rajagopala SV Sikorski P Kumar A Mosca R Vlasblom J Arnold R et al The binary protein-protein interaction landscape of Escherichia coli Nat Biotech 201432(3)285-90 24 Parrish JR Yu J Liu G Hines JA Chan JE Mangiola BA et al A proteome-wide protein interaction map for Campylobacter jejuni Genome Biology 20078(7)1-19 25 Wang Y Cui T Zhang C Yang M Huang Y Li W et al Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv Journal of proteome research 20109(12)6665-77 26 Cherkasov A Hsing M Zoraghi R Foster LJ See RH Stoynov N et al Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus Journal of proteome research 201110(3)1139-50 27 Hagen N Bayer K Rosch K Schindler M The intraviral protein interaction network of hepatitis C virus Molecular amp cellular proteomics MCP 201413(7)1676-89 28 Fossum E Friedel CC Rajagopala SV Titz B Baiker A Schmidt T et al Evolutionarily conserved herpesviral protein interaction networks PLoS pathogens 20095(9)e1000570 29 Stellberger T Hauser R Baiker A Pothineni VR Haas J Uetz P Improving the yeast two-hybrid system with permutated fusions proteins the Varicella Zoster Virus interactome Proteome science 201088 30 Obado SO Brillantes M Uryu K Zhang W Ketaren NE Chait BT et al Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex PLoS biology 201614(2)e1002365 31 Diss G Dube AK Boutin J Gagnon-Arsenault I Landry CR A systematic approach for the genetic dissection of protein complexes in living cells Cell Rep 20133(6)2155-67 32 Ferreira LG Oliva G Andricopulo AD Protein-protein interaction inhibitors advances in anticancer drug design Expert opinion on drug discovery 2016 33 Hamdi A Colas P Yeast two-hybrid methods and their applications in drug discovery Trends in pharmacological sciences 
201233(2)109-18 34 Zoraghi R Reiner NE Protein interaction networks as starting points to identify novel antimicrobial drug targets Current opinion in microbiology 201316(5)566-72 35 Khare S Nagle AS Biggart A Lai YH Liang F Davis LC et al Proteasome inhibition for treatment of leishmaniasis Chagas disease and sleeping sickness Nature 2016 36 Sahni N Yi S Taipale M Fuxman Bass JI Coulombe-Huntington J Yang F et al Widespread macromolecular interaction perturbations in human genetic disorders Cell 2015161(3)647-60 37 Jensen LJ Bork P Biochemistry Not comparable but complementary Science 2008322(5898)56-7 38 Syafrizayanti Betzen C Hoheisel JD Kastelic D Methods for analyzing and quantifying protein-protein interaction Expert review of proteomics 201411(1)107-20 39 Marcilla M Albar JP Quantitative proteomics A strategic ally to map protein interaction networks IUBMB life 201365(1)9-16 40 Woods AG Sokolowska I Ngounou Wetie AG Wormwood K Aslebagh R Patel S et al Mass spectrometry for proteomics-based investigation Advances in experimental medicine and biology 20148061-32 41 Chen GI Gingras AC Affinity-purification mass spectrometry (AP-MS) of serinethreonine phosphatases Methods 200742(3)298-305 42 Dunham WH Mullin M Gingras AC Affinity-purification coupled to mass spectrometry basic principles and strategies Proteomics 201212(10)1576-90

43 Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69 44 Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6 45 Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32 46 Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54 47 Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36) 48 Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82 49 Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88 50 Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848 51 Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6 52 Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4 53 Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63 54 Michnick SW Exploring protein interactions by interaction-induced folding of proteins from 
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14

64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9

84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709

(Fig. S1F). Briefly, precultures of diploid cells expressing 2xL/4xL DHFR fragment fusions to proteins of interest were adjusted to an OD600/ml of 1 in water. Five-fold serial dilutions were performed and 6 µL of each dilution were spotted on MTX and DMSO DHFR PCA media. Plates were incubated for seven days at 30 °C and subsequently imaged with an EOS Rebel T3i camera (Canon).

PCA images and statistical analyses

For the initial screen, colony size was estimated by measuring the number of pixels using the integrated intensity function as implemented in a custom script in ImageJ64 1.44o. We applied an image correction in which the intensity of each pixel was extracted and the pixel intensity matrix was smoothed using a two-way median polish and averaged with the raw image. We then converted the images to binary files and applied a manual threshold across plates. We selected colonies for measurement with a circular selection using particle detection with the built-in function “Analyze Particles” in ImageJ64. We excluded particles touching the edge of the selection and those that had an area below 20 pixels and a circularity below 0.5, and used the particle closest to the center. We considered the particle to be a colony if its mass center was within the mid-distance between two colonies. All plate images were also examined. The average of the background pixels was subtracted from the colony intensity.
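
The exclusion criteria above can be sketched in a few lines of Python. This is only an illustrative sketch, not the custom ImageJ script used for the screen. The function names (`circularity`, `keep_particle`, `pick_colony`), the dictionary layout of a particle and the optional `max_offset` argument are hypothetical. The circularity formula, 4π × area / perimeter², is the definition used by ImageJ. The circularity threshold is read here as 0.5, and we assume that each criterion (edge contact, small area, low circularity) excludes a particle on its own.

```python
import math

def circularity(area, perimeter):
    """ImageJ-style circularity: 4*pi*area / perimeter**2 (1.0 for a perfect circle)."""
    if perimeter <= 0:
        return 0.0
    return min(1.0, 4.0 * math.pi * area / perimeter ** 2)

def keep_particle(area, perimeter, touches_edge, min_area=20, min_circ=0.5):
    """Return True when a particle passes the exclusion criteria (assumed independent)."""
    if touches_edge:
        return False
    if area < min_area:
        return False
    return circularity(area, perimeter) >= min_circ

def pick_colony(particles, center, max_offset=None):
    """Among retained particles, return the one whose mass center is closest to `center`.

    `max_offset` (hypothetical) stands for the mid-distance between two colonies:
    a particle farther than this from the expected position is not called a colony.
    """
    kept = [p for p in particles
            if keep_particle(p["area"], p["perimeter"], p["touches_edge"])]
    if not kept:
        return None
    best = min(kept, key=lambda p: math.dist(p["xy"], center))
    if max_offset is not None and math.dist(best["xy"], center) > max_offset:
        return None
    return best
```

In practice the areas, perimeters and mass centers would come from the particle table exported by ImageJ; here they are plain numbers so that the logic can be checked in isolation.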

Colony intensity values from day 4 of growth of the second MTX selection were log2-transformed after adding 1 to each value to avoid null values. All colonies with a size smaller than 16 on the diploid selection plate were eliminated.

For the global PCA experiment, interactions with at least two replicates for all linker combinations were conserved and the median of colony size was used as the interaction score (Is). For each combination of linkers (2xL-2xL, 3xL-2xL, 4xL-2xL), the distribution of interaction scores was modeled as a mixture of two normal distributions using the R package mixtools (function normalmixEM) (Fig. S1B). The estimated mean (b) and standard deviation (sdb) of the background distribution were used to convert each interaction score into a z-score (Zs = (Is − b)/sdb). Interactions with a Zs greater than 2.5 were considered significantly detected. These Zs were used to compare the same interaction across different linker size combinations. We considered a change significant when Zs differed by more than 2.
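
As a rough illustration of the scoring steps described above, the sketch below goes from raw colony values to a z-score call. It is a minimal sketch under stated assumptions, not the analysis code of the study: the function names are hypothetical, the background mean and standard deviation are taken as inputs (in the study they come from the mixture model fitted with mixtools in R), the median is taken over log2-transformed replicate values, and the detection threshold is read here as 2.5.

```python
import math
from statistics import median

def interaction_score(colony_values):
    """Median over replicates of log2(value + 1); the +1 avoids log2(0)."""
    return median(math.log2(v + 1) for v in colony_values)

def z_score(score, background_mean, background_sd):
    """Zs = (Is - b) / sdb, with b and sdb estimated from the background component."""
    if background_sd <= 0:
        raise ValueError("background_sd must be positive")
    return (score - background_mean) / background_sd

def is_detected(zs, threshold=2.5):
    """An interaction is called detected when its z-score exceeds the threshold."""
    return zs > threshold

def changed(zs_a, zs_b, min_diff=2.0):
    """Two linker combinations differ when their z-scores differ by more than min_diff."""
    return abs(zs_a - zs_b) > min_diff
```

Because the z-score rescales each linker combination by its own background distribution, scores from different combinations become comparable, which is what makes the last comparison meaningful.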

For the intra-complexes experiment, extreme outliers on the MTX selection plates, defined as values below Q1 − 3(Q3 − Q1) or above Q3 + 3(Q3 − Q1), were excluded (Q1 and Q3 represent the first and third quartiles). Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were conserved and the median of colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
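
The outlier rule at the start of this paragraph can be written as a short filter. The sketch below is illustrative only: the function names are hypothetical, and the quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, since the text does not say how Q1 and Q3 were computed.

```python
from statistics import quantiles

def extreme_outlier_bounds(values, k=3.0):
    """Return (Q1 - k*IQR, Q3 + k*IQR), where IQR = Q3 - Q1."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def drop_extreme_outliers(values, k=3.0):
    """Keep only the values that fall inside the extreme-outlier fences."""
    low, high = extreme_outlier_bounds(values, k)
    return [v for v in values if low <= v <= high]
```

A factor of 3 on the interquartile range is stricter than the more common factor of 1.5, so only very aberrant colonies are removed.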

At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, baits and preys were positioned so that, within a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL); the Is for the different linker size combinations could therefore be compared directly. The difference with the reference 2xL-2xL interaction was calculated for each linker combination: 2xL-4xL, 4xL-2xL and 4xL-4xL. A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values). These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than −1, with t-test FDR p-value < 0.05) (Table S1C).
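
The three categories described above can be summarized as a small decision rule. The sketch below is an assumption-laden illustration, not the code used in the study. The text only states that p-values were FDR corrected, so the Benjamini-Hochberg procedure used here is an assumption; the function names are hypothetical; the p-value cutoff is read as 0.05; and pairs with a significant p-value but an absolute difference of 1 or less, which the text does not explicitly assign, are treated as unchanged.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify(detected_ref, detected_long, diffs, fdr_p, min_diff=1.0, alpha=0.05):
    """Classify one protein pair.

    detected_ref  -- detected with the reference 2xL-2xL combination
    detected_long -- detected with at least one longer linker combination
    diffs, fdr_p  -- per longer combination: difference to the reference score
                     and the corresponding FDR-adjusted paired t-test p-value
    """
    if not detected_ref:
        return "new" if detected_long else "not detected"
    for diff, p in zip(diffs, fdr_p):
        if abs(diff) > min_diff and p < alpha:
            return "quantitative change"
    return "unchanged"
```

Requiring both an effect-size threshold and a corrected p-value guards against calling very small but statistically significant differences a quantitative change.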

Analysis of protein distances within complexes

Yeast protein sequences of the RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for the RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core, and of 5CZ4 for the inner rings of the core. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.

The distances between the different proteins within a complex were calculated between
C-terminal residues. In several cases, the structure of the protein is incomplete in the
C-terminal section. In these cases, the last available residue was used instead to calculate the
distance (a list is provided in Table S2D). The distances were calculated from the weighted
shortest path using Dijkstra's algorithm as implemented in NetworkX (an example of the shortest
path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were
used as nodes to build the graph. Edges were placed between each pair of nodes within a
distance cutoff of 15 Å for RNApol II and 30 Å for the proteasome. The weight of each edge was
equal to the distance between the two nodes. Surface residues were identified as follows.
First, the structure of the protein complex was represented using the “show dots” and
“set dots_solvent” commands in PyMOL, using a solvent radius of 10 Å for the RNApol II
complex and 20 Å for the proteasome. These dots were exported in the “wrl” graphic file
format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of
any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure
were considered surface residues (see Fig. S2D for a representation of the method for the
proteasome). In cases where multiple copies of the proteins were present within the
complexes, the mean of the minimal possible distances was used for the analyses.
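The distance procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a standard-library Dijkstra implementation instead of NetworkX, the coordinates and dots are toy values rather than data from the structures, the function names are hypothetical, and `mean_minimal_distance` encodes one possible reading of "the mean of the minimal distances".

```python
import heapq
import math
from statistics import mean


def surface_residues(ca_coords, dots, radius):
    """Keep residues whose C-alpha lies within `radius` (in Å) of any solvent dot."""
    return {name: xyz for name, xyz in ca_coords.items()
            if any(math.dist(xyz, dot) <= radius for dot in dots)}


def build_graph(coords, cutoff):
    """Link every pair of nodes closer than `cutoff`; edge weight = Euclidean distance."""
    graph = {name: [] for name in coords}
    names = list(coords)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when the target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf


def mean_minimal_distance(graph, copies_a, copies_b):
    """Assumed reading: for each copy of A, shortest path to its nearest copy of B, averaged."""
    return mean(min(shortest_path_length(graph, a, b) for b in copies_b)
                for a in copies_a)


# Toy example (hypothetical coordinates in Å, not taken from any PDB entry).
ca_coords = {
    "A_cterm": (0.0, 0.0, 0.0),
    "s1": (10.0, 0.0, 0.0),
    "s2": (20.0, 0.0, 0.0),
    "B_cterm": (30.0, 0.0, 0.0),
    "buried": (15.0, 40.0, 0.0),
}
dots = [(0.0, 2.0, 0.0), (10.0, 2.0, 0.0), (20.0, 2.0, 0.0), (30.0, 2.0, 0.0)]

surface = surface_residues(ca_coords, dots, radius=5.0)
graph = build_graph(surface, cutoff=15.0)
path_length = shortest_path_length(graph, "A_cterm", "B_cterm")
```

In this toy case the buried residue is excluded from the graph, and the path between the two C-termini has to step through the intermediate surface nodes because the direct distance exceeds the cutoff.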

All PPI data related to the global PCA and intra-complex experiments can be found in
Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS
(55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include
three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as
PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to
be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether
longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No
evidence of protein degradation was found for any of the six proteins examined using antibodies
targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein
stability, it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we
constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL,
3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony
arrays (57), we queried these baits (n = 45) against 592 prey proteins fused to DHFR F[3]
(with the regular 2xL). These include proteins known to interact with the baits, proteins
within the same complexes as the baits, and random proteins used as controls, for a total of
26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126
PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left
panel), revealing a significant increase in signal-to-noise ratio with longer linkers,
particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score
differences with the 3xL (two decreases, two increases) and the 4xL (seven increases),
respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent
steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine
increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA
with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that
are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent
cases between baits and preys within the same complexes, suggesting that there is no decrease
in specificity with the elongated linkers. Finally, for the cases where proteins were not in the
same complex or were not previously shown to interact, it is likely that they represent actual
interactions previously undetected in living cells. For example, many genetic and physical
interactions (in vitro and in vivo) have been described between the actin cytoskeleton
and the proteasome (97, 98). Here, we detect some of these interactions in living cells (such as
between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL
(Table S1B). Together, these results show that DHFR PCA with increased linker size
reveals new interactions and could be an improved tool to study inter-complex associations.
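As an illustration of how PPIs can be called from a z-score threshold, the sketch below scores each bait-prey signal against a background distribution. The background model is an assumption (mean and standard deviation of control colony sizes), since the exact computation is not described in this section; the colony sizes, names and function names are hypothetical.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # assumed reading of the threshold quoted in the text

# Size of the global screen: 15 proteins x 3 linkers = 45 baits, each against 592 preys.
n_tests = 15 * 3 * 592  # 26,640 bait-prey combinations


def z_scores(signal, background):
    """z = (colony size - background mean) / background standard deviation."""
    mu, sigma = mean(background), stdev(background)
    return {pair: (value - mu) / sigma for pair, value in signal.items()}


def call_ppis(scores, threshold=Z_THRESHOLD):
    """Return the bait-prey pairs whose z-score exceeds the threshold."""
    return {pair for pair, z in scores.items() if z > threshold}


# Hypothetical colony sizes (arbitrary units), not data from the screen.
background = [10, 12, 11, 9, 13, 10, 12, 11]
signal = {("Bait1", "PreyA"): 20, ("Bait1", "PreyB"): 12}

scores = z_scores(signal, background)
positives = call_ppis(scores)
```

Passing the threshold as a parameter makes it easy to check how sensitive the number of called PPIs is to that choice.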

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we
selected five complexes (RNApol I, II and III, the proteasome and the COG complex), which
differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL,
4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs
between the RNApol I, II and III and the COG complex were also performed. Among the 10,192
unique tested PPIs, 755 interactions were considered true PPIs after filtering (Fig. S1B and
Table S1C), representing PPIs among 228 protein pairs (197 unique; reciprocal interactions
such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] count as a single PPI).
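The distinction between all interacting pairs and unique pairs amounts to ignoring bait-prey orientation. A minimal sketch, with hypothetical protein names X, Y and Z:

```python
def unique_pairs(directed_ppis):
    """Collapse orientations: (bait X, prey Y) and (bait Y, prey X) become one pair."""
    return {frozenset(pair) for pair in directed_ppis}


# Hypothetical detections, written as (bait, prey).
detected = [("X", "Y"), ("Y", "X"), ("X", "Z")]
pairs = unique_pairs(detected)
```

Here three detected bait-prey combinations correspond to two unique pairs, in the same way that 228 interacting pairs reduce to 197 unique ones in the text.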

As expected, no interaction was detected between the RNApol and COG proteins. Moreover,
reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR
F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost
60% of interacting pairs (135/228, or 114/197 unique), no significant change in
interaction strength was observed when using the 4xL compared to the 2xL, supporting the
observation that the elongated linkers cause no overall decrease in specificity. However, the
increased linker length had a clear impact on 93 (83 unique) interacting pairs (Fig. 1B).
PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74
(65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length
can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to
detect new interactions or to increase the PCA signal of a previously detected PPI
(2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL
combination. In rare cases, increasing linker length had the opposite effect, leading to PPI
loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest
components of RNApol II, contributes to five of the nine quantitatively decreased
interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99))
but seems to lose all of the others. This consequence may thus arise from steric effects rather
than from destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes.
However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol
complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol
complexes, more than half of the new interactions were found between proteins common to
the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each
individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved
Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D,
center panel). In the COG complex, new interactions were seen between Cog1, from the core
subunit, and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show
that doubling the linker length of central proteins in complexes expands the network of
interactions detected by DHFR PCA and helps to better describe the organization of protein
complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better
discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid)
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.
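The fraction of detected PPIs per combination of subcomplexes can be tabulated as follows. This is a sketch under assumptions: the protein names P1 to P4, their subcomplex assignments and the detections are hypothetical, and the function name is not from the original analysis.

```python
from collections import Counter


def detection_fraction(tested, detected, subcomplex):
    """Fraction of tested pairs that were detected, per unordered subcomplex combination."""
    def combo(pair):
        a, b = pair
        return tuple(sorted((subcomplex[a], subcomplex[b])))

    n_tested = Counter(combo(p) for p in tested)
    n_detected = Counter(combo(p) for p in detected)
    return {key: n_detected[key] / count for key, count in n_tested.items()}


# Hypothetical assignments and results.
subcomplex = {"P1": "base", "P2": "base", "P3": "lid", "P4": "lid"}
tested = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
          ("P2", "P3"), ("P2", "P4"), ("P3", "P4")]
detected = [("P1", "P2"), ("P3", "P4"), ("P1", "P3")]

fractions = detection_fraction(tested, detected, subcomplex)
```

Computing this table once per linker combination gives the kind of comparison summarized in Fig. 1E.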

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A).
We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all linker
lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linkers is clearly visible from the cumulative distribution of z-scores as a function
of pairwise distance, where positive z-scores accumulate up to a longer distance for the
4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density
distribution of distances within complexes is also slightly shifted towards larger distances for
longer linkers, showing that longer distances are more readily detected with longer linkers
(Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases
where a longer linker increases signal or leads to the detection of new interactions (Fig. 2C).
This demonstrates once again that a longer linker enhances the ability to detect interactions,
especially for proteins that are more distant in space.
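The relationship between pairwise distance and z-score is summarized with a Spearman rank correlation. A minimal standard-library sketch is given below; the distances and z-scores are hypothetical values chosen for illustration, not data from Table S2A, and the software used for the original analysis is not assumed here.

```python
import math
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical C-terminus distances (Å) and PPI z-scores.
distances = [20, 45, 60, 90, 120]
z = [8.1, 6.0, 6.5, 3.2, 2.7]
rho = spearman(distances, z)
```

A negative coefficient, as in this toy example, means that pairs with larger C-terminus distances tend to have lower z-scores.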

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here, we show
that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
with an enhanced ability to detect distant PPIs, including interactions among complexes and
among subcomplexes within large complexes. Because a single longer linker is generally
sufficient to detect new interactions, the current strains from the DHFR PCA collection could
be used as preys, requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information among subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research Grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.


Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful for inferring the super-organization
of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex
PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, considering each pair of reciprocal interactions, such as
X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome
complexes according to the distance between the C-termini, for interactions that are not
affected by longer linkers and those that increase in signal or that are newly detected.
p-values of Wilcoxon tests are shown.


Table S1A. Description of the strains constructed and used for this study.

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment.

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment.

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study.

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling.

Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference  Yeast protein  Complex  Identity (%)
4C2M chain 1  Rpc10  RNApol I  100
4C2M chain 2  Rpa34  RNApol I  92.4
4C2M chain 3  Rpa49  RNApol I  94.4
4C2M chain 4  Rpa43  RNApol I  100
4C2M chain 5  Rpa190  RNApol I  89.7
4C2M chain 6  Rpc40  RNApol I  100
4C2M chain 7  Rpa135  RNApol I  97.2
4C2M chain 8  Rpb5  RNApol I  100
4C2M chain 9  Rpa14  RNApol I  59.6
4C2M chain 10  Rpa43  RNApol I  81.4
4C2M chain 11  Rpo26  RNApol I  100
4C2M chain 12  Rpa12  RNApol I  100
4C2M chain 13  Rpb8  RNApol I  88.2
4C2M chain 14  Rpc19  RNApol I  100
4C2M chain 15  Rpb10  RNApol I  100
4C2M chain 16  Rpa49  RNApol I  100
4C2M chain 17  Rpc10  RNApol I  100
4C2M chain 18  Rpa43  RNApol I  100
4C2M chain 19  Rpa34  RNApol I  92.4
4C2M chain 20  Rpa135  RNApol I  96.2
4C2M chain 21  Rpa190  RNApol I  88.5
4C2M chain 22  Rpa14  RNApol I  55.1
4C2M chain 23  Rpc40  RNApol I  100
4C2M chain 24  Rpo26  RNApol I  100
4C2M chain 25  Rpb5  RNApol I  100
4C2M chain 26  Rpb8  RNApol I  88.2
4C2M chain 27  Rpa43  RNApol I  80.2
4C2M chain 28  Rpb10  RNApol I  100
4C2M chain 29  Rpa12  RNApol I  96
4C2M chain 30  Rpc19  RNApol I  100
4C3I chain A  Rpa190  RNApol I  89.2
4C3I chain C  Rpc40  RNApol I  99.3
4C3I chain B  Rpa135  RNApol I  98.2
4C3I chain E  Rpb5  RNApol I  100
4C3I chain D  Rpa14  RNApol I  55.1
4C3I chain G  Rpa43  RNApol I  78.3
4C3I chain F  Rpo26  RNApol I  100
4C3I chain I  Rpa12  RNApol I  100
4C3I chain H  Rpb8  RNApol I  84.7
4C3I chain K  Rpc19  RNApol I  100
4C3I chain J  Rpb10  RNApol I  100
4C3I chain M  Rpa49  RNApol I  97.2
4C3I chain L  Rpc10  RNApol I  100
4C3I chain N  Rpa34  RNApol I  88
4V1N chain A  Rpo21  RNApol II  97.9
4V1N chain C  Rpb3  RNApol II  100
4V1N chain B  Rpb2  RNApol II  93.6
4V1N chain E  Rpb5  RNApol II  100
4V1N chain D  Rpb4  RNApol II  80.8
4V1N chain G  Rpb7  RNApol II  100
4V1N chain F  Rpo26  RNApol II  100
4V1N chain I  Rpb9  RNApol II  100
4V1N chain H  Rpb8  RNApol II  91
4V1N chain K  Rpb11  RNApol II  100
4V1N chain J  Rpb10  RNApol II  100
4V1N chain L  Rpc10  RNApol II  100
4V1N chain R  Tfg2  RNApol II  60.3
5FJA chain A  Rpo31  RNApol III  96.2
5FJA chain C  Rpc40  RNApol III  100
5FJA chain B  Ret1  RNApol III  100
5FJA chain E  Rpb5  RNApol III  100
5FJA chain D  Rpc17  RNApol III  73.9
5FJA chain G  Rpc25  RNApol III  85.8
5FJA chain F  Rpo26  RNApol III  100
5FJA chain I  Rpc11  RNApol III  82.7
5FJA chain H  Rpb8  RNApol III  94.5
5FJA chain K  Rpc19  RNApol III  100
5FJA chain J  Rpb10  RNApol III  100
5FJA chain M  Rpc37  RNApol III  84.9
5FJA chain L  Rpc10  RNApol III  100
5FJA chain O  Rpc82  RNApol III  84.3
5FJA chain N  Rpc53  RNApol III  73.8
5FJA chain Q  Rpc31  RNApol III  100
5FJA chain P  Rpc34  RNApol III  57.2


Table S2C. Identity between the proteasome structure and the experimental sequences

Reference  Yeast protein  Complex  Identity (%)
5CZ4-centered chain A  Pre8  Proteasome  100
5CZ4-centered chain AA  Pre4  Proteasome  100
5CZ4-centered chain B  Pre9  Proteasome  100
5CZ4-centered chain BA  Pre3  Proteasome  100
5CZ4-centered chain C  Pre6  Proteasome  100
5CZ4-centered chain D  Pup2  Proteasome  97.1
5CZ4-centered chain E  Pre5  Proteasome  100
5CZ4-centered chain F  Pre10  Proteasome  100
5CZ4-centered chain G  Scl1  Proteasome  100
5CZ4-centered chain H  Pup1  Proteasome  100
5CZ4-centered chain I  Pup3  Proteasome  100
5CZ4-centered chain J  Pre1  Proteasome  100
5CZ4-centered chain K  Pre2  Proteasome  100
5CZ4-centered chain L  Pre7  Proteasome  100
5CZ4-centered chain M  Pre4  Proteasome  100
5CZ4-centered chain N  Pre3  Proteasome  100
5CZ4-centered chain O  Pre8  Proteasome  100
5CZ4-centered chain P  Pre9  Proteasome  100
5CZ4-centered chain Q  Pre6  Proteasome  100
5CZ4-centered chain R  Pup2  Proteasome  97.1
5CZ4-centered chain S  Pre5  Proteasome  100
5CZ4-centered chain T  Pre10  Proteasome  100
5CZ4-centered chain U  Scl1  Proteasome  100
5CZ4-centered chain V  Pup1  Proteasome  100
5CZ4-centered chain W  Pup3  Proteasome  100
5CZ4-centered chain X  Pre1  Proteasome  100
5CZ4-centered chain Y  Pre2  Proteasome  100
5CZ4-centered chain Z  Pre7  Proteasome  100
5A5B-centered chain A  Pre3  Proteasome  100
5A5B-centered chain AA  Rpn7  Proteasome  100
5A5B-centered chain B  Pup1  Proteasome  100
5A5B-centered chain BA  Rpn3  Proteasome  100
5A5B-centered chain C  Pup3  Proteasome  100
5A5B-centered chain CA  Rpn12  Proteasome  100
5A5B-centered chain D  Pre1  Proteasome  100
5A5B-centered chain DA  Rpn8  Proteasome  82.9
5A5B-centered chain E  Pre2  Proteasome  99.5
5A5B-centered chain EA  Rpn11  Proteasome  89.5
5A5B-centered chain F  Pre7  Proteasome  100
5A5B-centered chain FA  Rpn10  Proteasome  100
5A5B-centered chain G  Pre4  Proteasome  100
5A5B-centered chain GA  Rpn13  Proteasome  100
5A5B-centered chain HA  Sem1  Proteasome  100
5A5B-centered chain IA  Rpn1  Proteasome  85.9
5A5B-centered chain J  Scl1  Proteasome  100
5A5B-centered chain K  Pre8  Proteasome  100
5A5B-centered chain L  Pre9  Proteasome  100
5A5B-centered chain M  Pre6  Proteasome  100
5A5B-centered chain N  Pup2  Proteasome  100
5A5B-centered chain O  Pre5  Proteasome  100
5A5B-centered chain P  Pre10  Proteasome  100
5A5B-centered chain Q  Rpt1  Proteasome  88
5A5B-centered chain R  Rpt2  Proteasome  100
5A5B-centered chain S  Rpt6  Proteasome  100
5A5B-centered chain T  Rpt3  Proteasome  100
5A5B-centered chain U  Rpt4  Proteasome  100
5A5B-centered chain V  Rpt5  Proteasome  93.1
5A5B-centered chain W  Rpn2  Proteasome  90.9
5A5B-centered chain X  Rpn9  Proteasome  100
5A5B-centered chain Y  Rpn5  Proteasome  100
5A5B-centered chain Z  Rpn6  Proteasome  100
Constructed proteasome chain 1  Pup1  Proteasome  100
Constructed proteasome chain 10  Pre8  Proteasome  100
Constructed proteasome chain 11  Pre9  Proteasome  100
Constructed proteasome chain 12  Pre6  Proteasome  100
Constructed proteasome chain 13  Pup2  Proteasome  100
Constructed proteasome chain 14  Pre5  Proteasome  100
Constructed proteasome chain 15  Pre10  Proteasome  100
Constructed proteasome chain 16  Rpt1  Proteasome  88
Constructed proteasome chain 17  Rpt2  Proteasome  100
Constructed proteasome chain 18  Rpt6  Proteasome  100
Constructed proteasome chain 19  Rpt3  Proteasome  100
Constructed proteasome chain 2  Pup3  Proteasome  100
Constructed proteasome chain 20  Rpt4  Proteasome  100
Constructed proteasome chain 21  Rpt5  Proteasome  93.1
Constructed proteasome chain 22  Rpn2  Proteasome  90.9
Constructed proteasome chain 23  Rpn9  Proteasome  100
Constructed proteasome chain 24  Rpn5  Proteasome  100
Constructed proteasome chain 25  Rpn6  Proteasome  100
Constructed proteasome chain 26  Rpn7  Proteasome  100
Constructed proteasome chain 27  Rpn3  Proteasome  100
Constructed proteasome chain 28  Rpn12  Proteasome  100
Constructed proteasome chain 29  Rpn8  Proteasome  82.9
Constructed proteasome chain 3  Pre1  Proteasome  100
Constructed proteasome chain 30  Rpn11  Proteasome  89.5
Constructed proteasome chain 31  Rpn10  Proteasome  100
Constructed proteasome chain 32  Rpn13  Proteasome  100
Constructed proteasome chain 33  Sem1  Proteasome  100
Constructed proteasome chain 34  Rpn1  Proteasome  85.9
Constructed proteasome chain 35  Pup1  Proteasome  100
Constructed proteasome chain 36  Pup3  Proteasome  100
Constructed proteasome chain 37  Pre1  Proteasome  100
Constructed proteasome chain 38  Pre2  Proteasome  100
Constructed proteasome chain 39  Pre7  Proteasome  100
Constructed proteasome chain 4  Pre2  Proteasome  100
Constructed proteasome chain 40  Pre4  Proteasome  100
Constructed proteasome chain 41  Pre3  Proteasome  100
Constructed proteasome chain 42  Pre4  Proteasome  100
Constructed proteasome chain 45  Scl1  Proteasome  100
Constructed proteasome chain 46  Pre8  Proteasome  100
Constructed proteasome chain 47  Pre9  Proteasome  100
Constructed proteasome chain 48  Pre6  Proteasome  100
Constructed proteasome chain 49  Pup2  Proteasome  100
Constructed proteasome chain 5  Pre7  Proteasome  100
Constructed proteasome chain 50  Pre5  Proteasome  100
Constructed proteasome chain 51  Pre10  Proteasome  100
Constructed proteasome chain 52  Rpt1  Proteasome  88
Constructed proteasome chain 53  Rpt2  Proteasome  100
Constructed proteasome chain 54  Rpt6  Proteasome  100
Constructed proteasome chain 55  Rpt3  Proteasome  100
Constructed proteasome chain 56  Rpt4  Proteasome  100
Constructed proteasome chain 57  Rpt5  Proteasome  93.1
Constructed proteasome chain 58  Rpn2  Proteasome  90.9
Constructed proteasome chain 59  Rpn9  Proteasome  100
Constructed proteasome chain 6  Pre3  Proteasome  100
Constructed proteasome chain 60  Rpn5  Proteasome  100
Constructed proteasome chain 61  Rpn6  Proteasome  100
Constructed proteasome chain 62  Rpn7  Proteasome  100
Constructed proteasome chain 63  Rpn3  Proteasome  100
Constructed proteasome chain 64  Rpn12  Proteasome  100
Constructed proteasome chain 65  Rpn8  Proteasome  82.9
Constructed proteasome chain 66  Rpn11  Proteasome  89.5
Constructed proteasome chain 67  Rpn10  Proteasome  100
Constructed proteasome chain 68  Rpn13  Proteasome  100
Constructed proteasome chain 69  Sem1  Proteasome  100
Constructed proteasome chain 70  Rpn1  Proteasome  85.9
Constructed proteasome chain 9  Scl1  Proteasome  100


Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I,
II and III and proteasome structures

Yeast protein  Complex  Reference  Number of missing residues in C-ter
Rpa190  RNApol I  4C2M monomer 1  0
Rpa14  RNApol I  4C2M monomer 1  37
Rpa12  RNApol I  4C2M monomer 1  0
Rpb5  RNApol I  4C2M monomer 1  0
Rpb10  RNApol I  4C2M monomer 1  1
Rpa49  RNApol I  4C2M monomer 1  300
Rpc19  RNApol I  4C2M monomer 1  0
Rpb8  RNApol I  4C2M monomer 1  0
Rpa34  RNApol I  4C2M monomer 1  52
Rpa43  RNApol I  4C2M monomer 1  10
Rpc40  RNApol I  4C2M monomer 1  0
Rpc10  RNApol I  4C2M monomer 1  0
Rpa135  RNApol I  4C2M monomer 1  0
Rpo26  RNApol I  4C2M monomer 1  1
Rpa190  RNApol I  4C2M monomer 2  0
Rpa14  RNApol I  4C2M monomer 2  37
Rpa12  RNApol I  4C2M monomer 2  0
Rpb5  RNApol I  4C2M monomer 2  0
Rpb10  RNApol I  4C2M monomer 2  1
Rpa49  RNApol I  4C2M monomer 2  300
Rpc19  RNApol I  4C2M monomer 2  0
Rpb8  RNApol I  4C2M monomer 2  0
Rpa34  RNApol I  4C2M monomer 2  53
Rpa43  RNApol I  4C2M monomer 2  76
Rpc40  RNApol I  4C2M monomer 2  0
Rpc10  RNApol I  4C2M monomer 2  0
Rpa135  RNApol I  4C2M monomer 2  0
Rpo26  RNApol I  4C2M monomer 2  1
Rpa190  RNApol I  4C3I  1
Rpa14  RNApol I  4C3I  37
Rpb5  RNApol I  4C3I  0
Rpb10  RNApol I  4C3I  1
Rpa49  RNApol I  4C3I  301
Rpc19  RNApol I  4C3I  0
Rpb8  RNApol I  4C3I  0
Rpa34  RNApol I  4C3I  53
Rpa12  RNApol I  4C3I  0
Rpa43  RNApol I  4C3I  10
Rpc40  RNApol I  4C3I  0
Rpc10  RNApol I  4C3I  0
Rpa135  RNApol I  4C3I  0
Rpo26  RNApol I  4C3I  1
Rpb3  RNApol II  4V1N  50
Rpb11  RNApol II  4V1N  6
Rpb5  RNApol II  4V1N  0
Rpb7  RNApol II  4V1N  0
Rpb10  RNApol II  4V1N  5
Rpo26  RNApol II  4V1N  0
Rpb8  RNApol II  4V1N  0
Rpb4  RNApol II  4V1N  0
Rpb9  RNApol II  4V1N  2
Tfg2  RNApol II  4V1N  173
Rpb2  RNApol II  4V1N  0
Rpc10  RNApol II  4V1N  0
Rpo21  RNApol II  4V1N  278
Rpc11  RNApol III  5FJA  0
Rpc19  RNApol III  5FJA  0
Ret1  RNApol III  5FJA  0
Rpb5  RNApol III  5FJA  0
Rpb10  RNApol III  5FJA  3
Rpc37  RNApol III  5FJA  20
Rpc82  RNApol III  5FJA  0
Rpc31  RNApol III  5FJA  182
Rpb8  RNApol III  5FJA  0
Rpc53  RNApol III  5FJA  0
Rpc25  RNApol III  5FJA  0
Rpc34  RNApol III  5FJA  2
Rpo31  RNApol III  5FJA  0
Rpc40  RNApol III  5FJA  0
Rpc10  RNApol III  5FJA  0
Rpc17  RNApol III  5FJA  0
Rpo26  RNApol III  5FJA  2
Rpn6  Proteasome  5CZ4 and 5A5B  3
Rpn5  Proteasome  5CZ4 and 5A5B  3
Rpn3  Proteasome  5CZ4 and 5A5B  45
Rpn2  Proteasome  5CZ4 and 5A5B  20
Rpn1  Proteasome  5CZ4 and 5A5B  0
Rpn9  Proteasome  5CZ4 and 5A5B  6
Rpn8  Proteasome  5CZ4 and 5A5B  30
Pre10  Proteasome  5CZ4 and 5A5B  39
Pre6  Proteasome  5CZ4 and 5A5B  10
Pre7  Proteasome  5CZ4 and 5A5B  0
Rpt3  Proteasome  5CZ4 and 5A5B  0
Rpt2  Proteasome  5CZ4 and 5A5B  1
Pre2  Proteasome  5CZ4 and 5A5B  0
Rpt4  Proteasome  5CZ4 and 5A5B  10
Pre1  Proteasome  5CZ4 and 5A5B  3
Pre8  Proteasome  5CZ4 and 5A5B  0
Pre9  Proteasome  5CZ4 and 5A5B  12
Pup2  Proteasome  5CZ4 and 5A5B  9
Pup3  Proteasome  5CZ4 and 5A5B  0
Pup1  Proteasome  5CZ4 and 5A5B  6
Rpn13  Proteasome  5CZ4 and 5A5B  23
Rpn12  Proteasome  5CZ4 and 5A5B  2
Rpn11  Proteasome  5CZ4 and 5A5B  8
Rpn10  Proteasome  5CZ4 and 5A5B  71
Sem1  Proteasome  5CZ4 and 5A5B  0
Scl1  Proteasome  5CZ4 and 5A5B  0
Rpt1  Proteasome  5CZ4 and 5A5B  11
Pre4  Proteasome  5CZ4 and 5A5B  4
Pre5  Proteasome  5CZ4 and 5A5B  0
Rpt5  Proteasome  5CZ4 and 5A5B  0
Pre3  Proteasome  5CZ4 and 5A5B  0
Rpt6  Proteasome  5CZ4 and 5A5B  9
Rpn7  Proteasome  5CZ4 and 5A5B  7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right;
RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results
from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in spot-dilution assays of selected results from the intra-complex experiment. Examples
for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates
with colony size in standard PCA (left).


Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the cores from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: surface dots. Green spheres: surface residues on the proteasome.


Conclusion geacuteneacuterale

Le but de ce projet eacutetait de deacutevelopper une meacutethode hybride relativement simple Le terme

meacutethode hybride deacutesigne une meacutethode permettant de deacutetecter des associations entre des

proteacuteines agrave proximiteacute dans lrsquoespace sans qursquoelles ne soient neacutecessairement des interactions

physiques Cette meacutethode permettrait ainsi drsquoapprofondir et de mieux disseacutequer lrsquoarchitecture

des complexes proteacuteiques Concregravetement il srsquoagissait de modifier la longueur des

connecteurs de la DHFR PCA chez S cerevisiae Afin de valider la meacutethode il fallait drsquoabord

veacuterifier si lrsquoaugmentation de la longueur du connecteur permettait de modifier les interactions

deacutetecteacutees Il eacutetait eacutegalement pertinent de veacuterifier lrsquoapplication de la meacutethode pour lrsquoeacutetude de

complexes proteacuteiques agrave lrsquoaide de plusieurs combinaisons de connecteurs de diffeacuterentes

longueurs Enfin la confirmation de la validiteacute de la meacutethode pouvait ecirctre compleacuteteacutee par la

comparaison des reacutesultats obtenus avec les distances mesureacutees agrave partir des structures

proteacuteiques disponibles du proteacuteasome

The results of the first validation show that adjusting a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thereby improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only a single combination were used, because some protein-protein associations were detectable only with one specific combination of linkers. Using an elongated linker for the DHFR F[1,2] fragment appears to be sufficient to detect the majority of the new PPIs and of those whose signal increases.


The rare cases in which the signal decreased as linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nonetheless provide structural information, notably by identifying the strongest associations within the complex. Moreover, the use of elongated linkers provides information on the organization of protein complexes, particularly when it involves the central proteins. Finally, the detected associations closely reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA represents a valuable advance in the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the longer linkers, which would provide validation and a point of comparison and would help detect problems that arose during the construction of the fusion proteins. For example, it becomes easier to spot incorrect recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Furthermore, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium, so it is easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These features provide certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would provide several resolutions of the PPI network. It would first be necessary to determine the maximum length that still detects plausible protein-protein associations, thereby limiting false positives. It would also be necessary to determine the optimal increment to maximize new information while taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and therefore shorter linkers, could prove sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the preservation of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by linker elongation would give us a general picture of their architecture. For the proteome of an organism, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


different linker size combinations. We considered changes significant when Zs differed by more than 2.

For the intra-complex experiment, extreme outliers on the MTX selection plates (values below Q1 - 3(Q3 - Q1) or above Q3 + 3(Q3 - Q1), where Q1 and Q3 are the first and third quartiles) were excluded. Colonies corresponding to the control interaction and positioned on the array edges were removed from downstream analyses, as were strains for which sequencing results revealed mutations in the DHFR fusion proteins. After these final filtering steps, interactions with at least four replicates for every linker combination were retained, and the median colony size was used as the Is. Significant interactions were identified as described above (Fig. S1B). For the RNApol and the proteasome, the estimated mean (b) and standard deviation (sdb) of the background distribution were calculated for each linker combination and each complex separately. For the COG complex, because the number of pairwise interactions is limited to 64, all the results were combined to calculate these parameters. An interaction was considered detected when the Zs was larger than 2.5. From the 236 protein pairs presenting detected interactions with at least one linker combination, some pairs were filtered out, mainly because they did not pass all of the thresholds or because the fusion strains (Taf14 and Spt5 fused to DHFR F[3]) presented incoherent results for all tested interactions, leaving us with a total of 228 (197 unique) pairs of interacting proteins.
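The filtering and detection steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the function names, the example colony sizes and the background parameters are hypothetical, and the quartile interpolation method is an assumption because the text does not specify one.

```python
from statistics import median, quantiles


def drop_extreme_outliers(colony_sizes, k=3.0):
    """Keep colony sizes inside [Q1 - k*IQR, Q3 + k*IQR]."""
    # Assumption: "inclusive" quartile interpolation; the text does not say
    # which quartile definition was used.
    q1, _, q3 = quantiles(colony_sizes, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [size for size in colony_sizes if low <= size <= high]


def interaction_score(replicates, min_replicates=4):
    """Median colony size, or None when fewer than min_replicates remain."""
    if len(replicates) < min_replicates:
        return None
    return median(replicates)


def is_detected(score, background_mean, background_sd, threshold=2.5):
    """True when the z-score against the background exceeds the threshold."""
    z = (score - background_mean) / background_sd
    return z > threshold


# Hypothetical colony sizes; the value 100 is an extreme outlier.
kept = drop_extreme_outliers([10, 11, 12, 13, 14, 100])
score = interaction_score(kept)
```

In this sketch the background mean and standard deviation are passed in directly; in the text they are estimated per linker combination and per complex, except for the COG complex, where all results are pooled.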

At this step, pairs of interacting proteins presenting a new interaction (i.e., the interaction was not detected with the reference linker size (2xL-2xL) but was detected with a longer linker combination) were separated from the others and classified as new interactions (Table S1C). For the remaining pairs, baits and preys were positioned so that, in a block of four adjacent strains, all combinations of linker lengths could be tested for a specific interaction (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL); the Is for the different linker size combinations could therefore be compared directly. The difference from the reference 2xL-2xL interaction was calculated for each linker combination (2xL-4xL, 4xL-2xL and 4xL-4xL). A paired t-test was used to identify significant differences in colony size (with FDR-corrected p-values).

These pairs of interacting proteins were separated into two additional categories: unchanged interactions, in cases where the interaction was detected with the reference linker size (2xL-2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes, in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1 with t-test FDR p-value < 0.05) (Table S1C).
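The classification into new, unchanged and quantitatively changed interactions can be sketched as below. The text only says that p-values were FDR corrected; the Benjamini-Hochberg procedure used here is an assumption, as are the function names and the choice to label pairs that meet neither stated criterion as unchanged. The paired t-test itself is not reimplemented; raw p-values are taken as input.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_pair(detected_ref, detected_longer, differences, adjusted_p,
                  min_diff=1.0, alpha=0.05):
    """Classify one protein pair from its 2xL-2xL reference and longer linkers.

    differences and adjusted_p hold one value per longer linker combination
    (2xL-4xL, 4xL-2xL, 4xL-4xL), each compared with the 2xL-2xL reference.
    """
    if not detected_ref:
        return "new" if detected_longer else "not detected"
    for diff, p in zip(differences, adjusted_p):
        if abs(diff) > min_diff and p < alpha:
            return "quantitative change"
    # Assumption: anything else detected with the reference is "unchanged".
    return "unchanged"


adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

For example, `classify_pair(True, True, [1.5, 0.2, -0.3], [0.01, 0.5, 0.6])` falls in the quantitative-change category because the first combination exceeds both thresholds.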

Analysis of protein distances within complexes

Yeast protein sequences of RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This region is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core; the inner rings of the core were taken from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes lying within a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal possible distances was used for the analyses.
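The surface-path distance described above can be sketched with a small, self-contained example. The study used the Dijkstra implementation in NetworkX; the version below swaps in a plain standard-library Dijkstra so that it runs without dependencies. All function names, coordinates and cutoffs in the example are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import heapq
import math


def surface_nodes(ca_coords, dots, cutoff):
    """Keep residues whose C-alpha lies within cutoff of any surface dot."""
    return {res: xyz for res, xyz in ca_coords.items()
            if any(math.dist(xyz, dot) <= cutoff for dot in dots)}


def build_surface_graph(coords, cutoff):
    """Connect every pair of nodes within cutoff; edge weight = distance."""
    nodes = list(coords)
    graph = {node: [] for node in nodes}
    for index, a in enumerate(nodes):
        for b in nodes[index + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                graph[a].append((b, d))
                graph[b].append((a, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf


# Hypothetical toy coordinates (in angstroms), not taken from any PDB entry.
ca = {"A": (0, 0, 0), "B": (6, 8, 0), "C": (12, 0, 0), "buried": (6, -40, 0)}
dots = [(0, 5, 0), (6, 13, 0), (12, 5, 0)]
surface = surface_nodes(ca, dots, cutoff=6)
graph = build_surface_graph(surface, cutoff=10)
path_length = shortest_path_length(graph, "A", "C")
```

In this toy case A and C are 12 Å apart in a straight line, but no direct edge exists at a 10 Å cutoff, so the path runs through B and measures 20 Å. This is the point of the approach: the distance follows the surface of the complex instead of cutting through it.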

All PPI data related to the global PCA and intra-complex experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.
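As a small illustration of the linker nomenclature, the three linkers are simply two, three and four tandem copies of GGGGS. The helper name below is hypothetical.

```python
LINKER_UNIT = "GGGGS"


def linker(repeats):
    """Amino-acid sequence of a linker made of tandem GGGGS repeats."""
    return LINKER_UNIT * repeats


# Print the sequence and length (in residues) of each linker used in the study.
for name, repeats in (("2xL", 2), ("3xL", 3), ("4xL", 4)):
    sequence = linker(repeats)
    print(name, sequence, len(sequence))
```

The 4xL is therefore 20 residues long, twice the length of the standard 10-residue 2xL.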

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, each with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment. Using high-density yeast colony arrays (57), we queried these baits (n = 45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These preys include proteins known to interact with the baits, proteins within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions (45 baits × 592 preys) in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
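The z-score criterion used to call PPIs in this screen can be sketched as follows. This is only an illustration: the text states the threshold but not how the background distribution was estimated, so the use of a separate list of background colony sizes, the function names and the toy values are all assumptions.

```python
from statistics import mean, stdev

def z_scores(colony_sizes, background):
    """Convert colony sizes to z-scores relative to a background distribution.

    colony_sizes: dict mapping (bait, prey) to colony size.
    background: colony sizes assumed to represent non-interacting pairs
    (how the background was defined in the study is an assumption here).
    """
    mu = mean(background)
    sigma = stdev(background)
    return {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}

def call_ppis(colony_sizes, background, threshold=2.5):
    """Return the pairs whose z-score exceeds the threshold."""
    scores = z_scores(colony_sizes, background)
    return {pair for pair, z in scores.items() if z > threshold}

# Toy data: one clear interaction, one pair within the background range.
background = [10, 12, 11, 9, 13]
sizes = {("BaitA", "PreyB"): 20, ("BaitA", "PreyC"): 12}
print(call_ppis(sizes, background))
```

Running the same call on the 2xL, 3xL and 4xL datasets separately would give per-linker PPI counts comparable in spirit to those reported above.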

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, the proteasome and the COG complex), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL and 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and COG complexes were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique pairs when reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], are counted as a single PPI).
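The distinction between protein pairs and unique pairs can be illustrated with a short sketch. It assumes that each detected PPI is recorded as a (bait, prey) tuple; the function name and the toy list are hypothetical.

```python
def unique_pairs(detected):
    """Collapse reciprocal interactions into a single unordered pair.

    detected: iterable of (bait, prey) tuples, where the bait carries
    DHFR F[1,2] and the prey carries DHFR F[3].
    """
    # A frozenset ignores orientation, so (X, Y) and (Y, X) become one entry.
    return {frozenset(pair) for pair in detected}

# Toy example: Rpb2-Rpb3 is detected in both orientations.
detected = [("Rpb2", "Rpb3"), ("Rpb3", "Rpb2"), ("Rpb5", "Rpo26")]
print(len(detected), "oriented pairs,", len(unique_pairs(detected)), "unique pairs")
```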

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1, from the core subunit, and proteins from lobe A or lobe B (Fig. 1D, right panel). Together, these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal obtained with longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid), regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.
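The fraction of detected PPIs per combination of subcomplexes, as summarized in Fig. 1E, can be computed along the following lines. This is a sketch only: the data structures, the function name and the toy pairs are assumptions made for illustration and do not reproduce the study's data.

```python
from collections import defaultdict

def detected_fraction(tested, detected, subcomplex):
    """Fraction of tested pairs that were detected, per subcomplex combination.

    tested: list of (bait, prey) pairs that were assayed.
    detected: set of (bait, prey) pairs called as PPIs.
    subcomplex: dict mapping each protein to its subcomplex name.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [n_detected, n_tested]
    for bait, prey in tested:
        # Sort the two labels so that base-lid and lid-base share one key.
        key = tuple(sorted((subcomplex[bait], subcomplex[prey])))
        counts[key][1] += 1
        if (bait, prey) in detected:
            counts[key][0] += 1
    return {key: n_det / n_test for key, (n_det, n_test) in counts.items()}

# Toy proteasome example (illustrative pairs only).
subcomplex = {"Rpt1": "base", "Rpt2": "base", "Rpn5": "lid", "Pre1": "core"}
tested = [("Rpt1", "Rpt2"), ("Rpt1", "Rpn5"), ("Rpn5", "Pre1"), ("Rpt2", "Rpn5")]
detected = {("Rpt1", "Rpt2"), ("Rpt1", "Rpn5")}
print(detected_fraction(tested, detected, subcomplex))
```

Computing this table once per linker combination makes it possible to compare, for instance, the base-base fraction between 2xL-2xL and 4xL-4xL.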

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distance, where positive z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are better detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.
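The relationship between pairwise distance and z-score is summarized with Spearman correlations (Fig. 2B). A minimal pure-Python sketch of that statistic is given below for readers who want to see what is being computed. It is not the code used for the study, the function names are hypothetical, and the toy values are invented.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: z-scores decrease as the C-terminus distance increases.
distances = [10, 20, 30, 40]
zscores = [8, 6, 4, 1]
print(spearman(distances, zscores))
```

A negative coefficient, as in the toy data, corresponds to the trend described above: the further apart the C-termini, the weaker the PCA signal.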

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circles: RNApol I, II and III. Orange squares: proteasome. Purple triangles: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linkers but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.

Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini: interactions that are not affected by longer linkers, those that increase in signal and those that are newly detected. p-values of Wilcoxon tests are shown.

Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference       Yeast protein  Complex     Identity (%)
4C2M chain 1    Rpc10          RNApol I    100
4C2M chain 2    Rpa34          RNApol I    92.4
4C2M chain 3    Rpa49          RNApol I    94.4
4C2M chain 4    Rpa43          RNApol I    100
4C2M chain 5    Rpa190         RNApol I    89.7
4C2M chain 6    Rpc40          RNApol I    100
4C2M chain 7    Rpa135         RNApol I    97.2
4C2M chain 8    Rpb5           RNApol I    100
4C2M chain 9    Rpa14          RNApol I    59.6
4C2M chain 10   Rpa43          RNApol I    81.4
4C2M chain 11   Rpo26          RNApol I    100
4C2M chain 12   Rpa12          RNApol I    100
4C2M chain 13   Rpb8           RNApol I    88.2
4C2M chain 14   Rpc19          RNApol I    100
4C2M chain 15   Rpb10          RNApol I    100
4C2M chain 16   Rpa49          RNApol I    100
4C2M chain 17   Rpc10          RNApol I    100
4C2M chain 18   Rpa43          RNApol I    100
4C2M chain 19   Rpa34          RNApol I    92.4
4C2M chain 20   Rpa135         RNApol I    96.2
4C2M chain 21   Rpa190         RNApol I    88.5
4C2M chain 22   Rpa14          RNApol I    55.1
4C2M chain 23   Rpc40          RNApol I    100
4C2M chain 24   Rpo26          RNApol I    100
4C2M chain 25   Rpb5           RNApol I    100
4C2M chain 26   Rpb8           RNApol I    88.2
4C2M chain 27   Rpa43          RNApol I    80.2
4C2M chain 28   Rpb10          RNApol I    100
4C2M chain 29   Rpa12          RNApol I    96
4C2M chain 30   Rpc19          RNApol I    100
4C3I chain A    Rpa190         RNApol I    89.2
4C3I chain C    Rpc40          RNApol I    99.3
4C3I chain B    Rpa135         RNApol I    98.2
4C3I chain E    Rpb5           RNApol I    100
4C3I chain D    Rpa14          RNApol I    55.1
4C3I chain G    Rpa43          RNApol I    78.3
4C3I chain F    Rpo26          RNApol I    100
4C3I chain I    Rpa12          RNApol I    100
4C3I chain H    Rpb8           RNApol I    84.7
4C3I chain K    Rpc19          RNApol I    100
4C3I chain J    Rpb10          RNApol I    100
4C3I chain M    Rpa49          RNApol I    97.2
4C3I chain L    Rpc10          RNApol I    100
4C3I chain N    Rpa34          RNApol I    88
4V1N chain A    Rpo21          RNApol II   97.9
4V1N chain C    Rpb3           RNApol II   100
4V1N chain B    Rpb2           RNApol II   93.6
4V1N chain E    Rpb5           RNApol II   100
4V1N chain D    Rpb4           RNApol II   80.8
4V1N chain G    Rpb7           RNApol II   100
4V1N chain F    Rpo26          RNApol II   100
4V1N chain I    Rpb9           RNApol II   100
4V1N chain H    Rpb8           RNApol II   91
4V1N chain K    Rpb11          RNApol II   100
4V1N chain J    Rpb10          RNApol II   100
4V1N chain L    Rpc10          RNApol II   100
4V1N chain R    Tfg2           RNApol II   60.3
5FJA chain A    Rpo31          RNApol III  96.2
5FJA chain C    Rpc40          RNApol III  100
5FJA chain B    Ret1           RNApol III  100
5FJA chain E    Rpb5           RNApol III  100
5FJA chain D    Rpc17          RNApol III  73.9
5FJA chain G    Rpc25          RNApol III  85.8
5FJA chain F    Rpo26          RNApol III  100
5FJA chain I    Rpc11          RNApol III  82.7
5FJA chain H    Rpb8           RNApol III  94.5
5FJA chain K    Rpc19          RNApol III  100
5FJA chain J    Rpb10          RNApol III  100
5FJA chain M    Rpc37          RNApol III  84.9
5FJA chain L    Rpc10          RNApol III  100
5FJA chain O    Rpc82          RNApol III  84.3
5FJA chain N    Rpc53          RNApol III  73.8
5FJA chain Q    Rpc31          RNApol III  100
5FJA chain P    Rpc34          RNApol III  57.2

Table S2C. Identity between the proteasome structures and the experimental sequences

Reference                        Yeast protein  Complex     Identity (%)
5CZ4-centered chain A            Pre8           Proteasome  100
5CZ4-centered chain AA           Pre4           Proteasome  100
5CZ4-centered chain B            Pre9           Proteasome  100
5CZ4-centered chain BA           Pre3           Proteasome  100
5CZ4-centered chain C            Pre6           Proteasome  100
5CZ4-centered chain D            Pup2           Proteasome  97.1
5CZ4-centered chain E            Pre5           Proteasome  100
5CZ4-centered chain F            Pre10          Proteasome  100
5CZ4-centered chain G            Scl1           Proteasome  100
5CZ4-centered chain H            Pup1           Proteasome  100
5CZ4-centered chain I            Pup3           Proteasome  100
5CZ4-centered chain J            Pre1           Proteasome  100
5CZ4-centered chain K            Pre2           Proteasome  100
5CZ4-centered chain L            Pre7           Proteasome  100
5CZ4-centered chain M            Pre4           Proteasome  100
5CZ4-centered chain N            Pre3           Proteasome  100
5CZ4-centered chain O            Pre8           Proteasome  100
5CZ4-centered chain P            Pre9           Proteasome  100
5CZ4-centered chain Q            Pre6           Proteasome  100
5CZ4-centered chain R            Pup2           Proteasome  97.1
5CZ4-centered chain S            Pre5           Proteasome  100
5CZ4-centered chain T            Pre10          Proteasome  100
5CZ4-centered chain U            Scl1           Proteasome  100
5CZ4-centered chain V            Pup1           Proteasome  100
5CZ4-centered chain W            Pup3           Proteasome  100
5CZ4-centered chain X            Pre1           Proteasome  100
5CZ4-centered chain Y            Pre2           Proteasome  100
5CZ4-centered chain Z            Pre7           Proteasome  100
5A5B-centered chain A            Pre3           Proteasome  100
5A5B-centered chain AA           Rpn7           Proteasome  100
5A5B-centered chain B            Pup1           Proteasome  100
5A5B-centered chain BA           Rpn3           Proteasome  100
5A5B-centered chain C            Pup3           Proteasome  100
5A5B-centered chain CA           Rpn12          Proteasome  100
5A5B-centered chain D            Pre1           Proteasome  100
5A5B-centered chain DA           Rpn8           Proteasome  82.9
5A5B-centered chain E            Pre2           Proteasome  99.5
5A5B-centered chain EA           Rpn11          Proteasome  89.5
5A5B-centered chain F            Pre7           Proteasome  100
5A5B-centered chain FA           Rpn10          Proteasome  100
5A5B-centered chain G            Pre4           Proteasome  100
5A5B-centered chain GA           Rpn13          Proteasome  100
5A5B-centered chain HA           Sem1           Proteasome  100
5A5B-centered chain IA           Rpn1           Proteasome  85.9
5A5B-centered chain J            Scl1           Proteasome  100
5A5B-centered chain K            Pre8           Proteasome  100
5A5B-centered chain L            Pre9           Proteasome  100
5A5B-centered chain M            Pre6           Proteasome  100
5A5B-centered chain N            Pup2           Proteasome  100
5A5B-centered chain O            Pre5           Proteasome  100
5A5B-centered chain P            Pre10          Proteasome  100
5A5B-centered chain Q            Rpt1           Proteasome  88
5A5B-centered chain R            Rpt2           Proteasome  100
5A5B-centered chain S            Rpt6           Proteasome  100
5A5B-centered chain T            Rpt3           Proteasome  100
5A5B-centered chain U            Rpt4           Proteasome  100
5A5B-centered chain V            Rpt5           Proteasome  93.1
5A5B-centered chain W            Rpn2           Proteasome  90.9
5A5B-centered chain X            Rpn9           Proteasome  100
5A5B-centered chain Y            Rpn5           Proteasome  100
5A5B-centered chain Z            Rpn6           Proteasome  100
Constructed proteasome chain 1   Pup1           Proteasome  100
Constructed proteasome chain 10  Pre8           Proteasome  100
Constructed proteasome chain 11  Pre9           Proteasome  100
Constructed proteasome chain 12  Pre6           Proteasome  100
Constructed proteasome chain 13  Pup2           Proteasome  100
Constructed proteasome chain 14  Pre5           Proteasome  100
Constructed proteasome chain 15  Pre10          Proteasome  100
Constructed proteasome chain 16  Rpt1           Proteasome  88
Constructed proteasome chain 17  Rpt2           Proteasome  100
Constructed proteasome chain 18  Rpt6           Proteasome  100
Constructed proteasome chain 19  Rpt3           Proteasome  100
Constructed proteasome chain 2   Pup3           Proteasome  100
Constructed proteasome chain 20  Rpt4           Proteasome  100
Constructed proteasome chain 21  Rpt5           Proteasome  93.1
Constructed proteasome chain 22  Rpn2           Proteasome  90.9
Constructed proteasome chain 23  Rpn9           Proteasome  100
Constructed proteasome chain 24  Rpn5           Proteasome  100
Constructed proteasome chain 25  Rpn6           Proteasome  100
Constructed proteasome chain 26  Rpn7           Proteasome  100
Constructed proteasome chain 27  Rpn3           Proteasome  100
Constructed proteasome chain 28  Rpn12          Proteasome  100
Constructed proteasome chain 29  Rpn8           Proteasome  82.9
Constructed proteasome chain 3   Pre1           Proteasome  100
Constructed proteasome chain 30  Rpn11          Proteasome  89.5
Constructed proteasome chain 31  Rpn10          Proteasome  100
Constructed proteasome chain 32  Rpn13          Proteasome  100
Constructed proteasome chain 33  Sem1           Proteasome  100
Constructed proteasome chain 34  Rpn1           Proteasome  85.9
Constructed proteasome chain 35  Pup1           Proteasome  100
Constructed proteasome chain 36  Pup3           Proteasome  100
Constructed proteasome chain 37  Pre1           Proteasome  100
Constructed proteasome chain 38  Pre2           Proteasome  100
Constructed proteasome chain 39  Pre7           Proteasome  100
Constructed proteasome chain 4   Pre2           Proteasome  100
Constructed proteasome chain 40  Pre4           Proteasome  100
Constructed proteasome chain 41  Pre3           Proteasome  100
Constructed proteasome chain 42  Pre4           Proteasome  100
Constructed proteasome chain 45  Scl1           Proteasome  100
Constructed proteasome chain 46  Pre8           Proteasome  100
Constructed proteasome chain 47  Pre9           Proteasome  100
Constructed proteasome chain 48  Pre6           Proteasome  100
Constructed proteasome chain 49  Pup2           Proteasome  100
Constructed proteasome chain 5   Pre7           Proteasome  100
Constructed proteasome chain 50  Pre5           Proteasome  100
Constructed proteasome chain 51  Pre10          Proteasome  100
Constructed proteasome chain 52  Rpt1           Proteasome  88
Constructed proteasome chain 53  Rpt2           Proteasome  100
Constructed proteasome chain 54  Rpt6           Proteasome  100
Constructed proteasome chain 55  Rpt3           Proteasome  100
Constructed proteasome chain 56  Rpt4           Proteasome  100
Constructed proteasome chain 57  Rpt5           Proteasome  93.1
Constructed proteasome chain 58  Rpn2           Proteasome  90.9
Constructed proteasome chain 59  Rpn9           Proteasome  100
Constructed proteasome chain 6   Pre3           Proteasome  100
Constructed proteasome chain 60  Rpn5           Proteasome  100
Constructed proteasome chain 61  Rpn6           Proteasome  100
Constructed proteasome chain 62  Rpn7           Proteasome  100
Constructed proteasome chain 63  Rpn3           Proteasome  100
Constructed proteasome chain 64  Rpn12          Proteasome  100
Constructed proteasome chain 65  Rpn8           Proteasome  82.9
Constructed proteasome chain 66  Rpn11          Proteasome  89.5
Constructed proteasome chain 67  Rpn10          Proteasome  100
Constructed proteasome chain 68  Rpn13          Proteasome  100
Constructed proteasome chain 69  Sem1           Proteasome  100
Constructed proteasome chain 70  Rpn1           Proteasome  85.9
Constructed proteasome chain 9   Scl1           Proteasome  100

Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Yeast protein  Complex     Reference         Number of missing residues in C-ter
Rpa190         RNApol I    4C2M monomer 1    0
Rpa14          RNApol I    4C2M monomer 1    37
Rpa12          RNApol I    4C2M monomer 1    0
Rpb5           RNApol I    4C2M monomer 1    0
Rpb10          RNApol I    4C2M monomer 1    1
Rpa49          RNApol I    4C2M monomer 1    300
Rpc19          RNApol I    4C2M monomer 1    0
Rpb8           RNApol I    4C2M monomer 1    0
Rpa34          RNApol I    4C2M monomer 1    52
Rpa43          RNApol I    4C2M monomer 1    10
Rpc40          RNApol I    4C2M monomer 1    0
Rpc10          RNApol I    4C2M monomer 1    0
Rpa135         RNApol I    4C2M monomer 1    0
Rpo26          RNApol I    4C2M monomer 1    1
Rpa190         RNApol I    4C2M monomer 2    0
Rpa14          RNApol I    4C2M monomer 2    37
Rpa12          RNApol I    4C2M monomer 2    0
Rpb5           RNApol I    4C2M monomer 2    0
Rpb10          RNApol I    4C2M monomer 2    1
Rpa49          RNApol I    4C2M monomer 2    300
Rpc19          RNApol I    4C2M monomer 2    0
Rpb8           RNApol I    4C2M monomer 2    0
Rpa34          RNApol I    4C2M monomer 2    53
Rpa43          RNApol I    4C2M monomer 2    76
Rpc40          RNApol I    4C2M monomer 2    0
Rpc10          RNApol I    4C2M monomer 2    0
Rpa135         RNApol I    4C2M monomer 2    0
Rpo26          RNApol I    4C2M monomer 2    1
Rpa190         RNApol I    4C3I              1
Rpa14          RNApol I    4C3I              37
Rpb5           RNApol I    4C3I              0
Rpb10          RNApol I    4C3I              1
Rpa49          RNApol I    4C3I              301
Rpc19          RNApol I    4C3I              0
Rpb8           RNApol I    4C3I              0
Rpa34          RNApol I    4C3I              53
Rpa12          RNApol I    4C3I              0
Rpa43          RNApol I    4C3I              10
Rpc40          RNApol I    4C3I              0
Rpc10          RNApol I    4C3I              0
Rpa135         RNApol I    4C3I              0
Rpo26          RNApol I    4C3I              1
Rpb3           RNApol II   4V1N              50
Rpb11          RNApol II   4V1N              6
Rpb5           RNApol II   4V1N              0
Rpb7           RNApol II   4V1N              0
Rpb10          RNApol II   4V1N              5
Rpo26          RNApol II   4V1N              0
Rpb8           RNApol II   4V1N              0
Rpb4           RNApol II   4V1N              0
Rpb9           RNApol II   4V1N              2
Tfg2           RNApol II   4V1N              173
Rpb2           RNApol II   4V1N              0
Rpc10          RNApol II   4V1N              0
Rpo21          RNApol II   4V1N              278
Rpc11          RNApol III  5FJA              0
Rpc19          RNApol III  5FJA              0
Ret1           RNApol III  5FJA              0
Rpb5           RNApol III  5FJA              0
Rpb10          RNApol III  5FJA              3
Rpc37          RNApol III  5FJA              20
Rpc82          RNApol III  5FJA              0
Rpc31          RNApol III  5FJA              182
Rpb8           RNApol III  5FJA              0
Rpc53          RNApol III  5FJA              0
Rpc25          RNApol III  5FJA              0
Rpc34          RNApol III  5FJA              2
Rpo31          RNApol III  5FJA              0
Rpc40          RNApol III  5FJA              0
Rpc10          RNApol III  5FJA              0
Rpc17          RNApol III  5FJA              0
Rpo26          RNApol III  5FJA              2
Rpn6           Proteasome  5CZ4 and 5A5B     3
Rpn5           Proteasome  5CZ4 and 5A5B     3
Rpn3           Proteasome  5CZ4 and 5A5B     45
Rpn2           Proteasome  5CZ4 and 5A5B     20
Rpn1           Proteasome  5CZ4 and 5A5B     0
Rpn9           Proteasome  5CZ4 and 5A5B     6
Rpn8           Proteasome  5CZ4 and 5A5B     30
Pre10          Proteasome  5CZ4 and 5A5B     39
Pre6           Proteasome  5CZ4 and 5A5B     10
Pre7           Proteasome  5CZ4 and 5A5B     0
Rpt3           Proteasome  5CZ4 and 5A5B     0
Rpt2           Proteasome  5CZ4 and 5A5B     1
Pre2           Proteasome  5CZ4 and 5A5B     0
Rpt4           Proteasome  5CZ4 and 5A5B     10
Pre1           Proteasome  5CZ4 and 5A5B     3
Pre8           Proteasome  5CZ4 and 5A5B     0
Pre9           Proteasome  5CZ4 and 5A5B     12
Pup2           Proteasome  5CZ4 and 5A5B     9
Pup3           Proteasome  5CZ4 and 5A5B     0
Pup1           Proteasome  5CZ4 and 5A5B     6
Rpn13          Proteasome  5CZ4 and 5A5B     23
Rpn12          Proteasome  5CZ4 and 5A5B     2
Rpn11          Proteasome  5CZ4 and 5A5B     8
Rpn10          Proteasome  5CZ4 and 5A5B     71
Sem1           Proteasome  5CZ4 and 5A5B     0
Scl1           Proteasome  5CZ4 and 5A5B     0
Rpt1           Proteasome  5CZ4 and 5A5B     11
Pre4           Proteasome  5CZ4 and 5A5B     4
Pre5           Proteasome  5CZ4 and 5A5B     0
Rpt5           Proteasome  5CZ4 and 5A5B     0
Pre3           Proteasome  5CZ4 and 5A5B     0
Rpt6           Proteasome  5CZ4 and 5A5B     9
Rpn7           Proteasome  5CZ4 and 5A5B     7

Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions between proteins that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assays of selected results of the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in standard PCA (left).

Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: surface dots. Green spheres: surface residues on the proteasome.
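The distance-weighted shortest path illustrated in panels (C) and (D) can be sketched with Dijkstra's algorithm. This is an illustration only: it assumes that surface residues are represented by 3D points, that two points are connected when they lie within a cutoff distance, and that each edge is weighted by the Euclidean distance. The cutoff value, the function name and the toy coordinates are hypothetical and are not taken from the original modeling pipeline.

```python
import heapq
import math

def shortest_surface_path(coords, start, end, cutoff=10.0):
    """Length of the shortest path from point `start` to point `end`.

    coords: list of (x, y, z) coordinates of surface points.
    start, end: indices of the two C-terminal points in `coords`.
    cutoff: maximum distance for two points to be linked (assumed value).
    Returns math.inf when no path exists.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == end:
            return d
        if d > best.get(u, math.inf):
            continue  # stale heap entry
        for v, point in enumerate(coords):
            if v == u:
                continue
            w = math.dist(coords[u], point)
            # Only step between neighbouring surface points.
            if w <= cutoff and d + w < best.get(v, math.inf):
                best[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return math.inf

# Toy surface: the path has to go around a corner instead of cutting through.
toy = [(0, 0, 0), (5, 0, 0), (10, 0, 0), (10, 5, 0)]
print(shortest_surface_path(toy, 0, 3, cutoff=6.0))
```

Restricting steps to neighbouring surface points forces the path to follow the surface of the complex rather than pass through its interior, which is the reason for using a path length instead of a straight-line distance.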


Conclusion geacuteneacuterale

Le but de ce projet eacutetait de deacutevelopper une meacutethode hybride relativement simple Le terme

meacutethode hybride deacutesigne une meacutethode permettant de deacutetecter des associations entre des

proteacuteines agrave proximiteacute dans lrsquoespace sans qursquoelles ne soient neacutecessairement des interactions

physiques Cette meacutethode permettrait ainsi drsquoapprofondir et de mieux disseacutequer lrsquoarchitecture

des complexes proteacuteiques Concregravetement il srsquoagissait de modifier la longueur des

connecteurs de la DHFR PCA chez S cerevisiae Afin de valider la meacutethode il fallait drsquoabord

veacuterifier si lrsquoaugmentation de la longueur du connecteur permettait de modifier les interactions

deacutetecteacutees Il eacutetait eacutegalement pertinent de veacuterifier lrsquoapplication de la meacutethode pour lrsquoeacutetude de

complexes proteacuteiques agrave lrsquoaide de plusieurs combinaisons de connecteurs de diffeacuterentes

longueurs Enfin la confirmation de la validiteacute de la meacutethode pouvait ecirctre compleacuteteacutee par la

comparaison des reacutesultats obtenus avec les distances mesureacutees agrave partir des structures

proteacuteiques disponibles du proteacuteasome

Les reacutesultats de la premiegravere validation deacutemontrent qursquoen jouant sur un seul paramegravetre soit

en doublant la longueur drsquoun connecteur le ratio signal sur bruit a significativement

augmenteacute permettant une meilleure identification des associations Sept nouvelles

associations ont eacuteteacute observeacutees agrave lrsquointeacuterieur de complexes proteacuteiques et entre diffeacuterents

complexes notamment entre le proteacuteasome et le cytosquelette drsquoactine La nature des

associations deacutetecteacutees suggegravere que la speacutecificiteacute de la DHFR PCA est conserveacutee malgreacute la

modification de la longueur du connecteur Lrsquoeacutetude approfondie des cinq complexes

proteacuteiques montre que la variation de la DHFR PCA permet de deacutetecter de nouvelles

interactions en conservant la speacutecificiteacute de la meacutethode En effet parmi lrsquoensemble des

interactions uniques deacutetecteacutees plus de 30 eacutetaient nouvelles Donc on pourrait srsquoattendre agrave

obtenir pratiquement autant de nouvelles interactions si cette variation de la PCA eacutetait

appliqueacutee agrave des complexes proteacuteiques deacutejagrave eacutetudieacutes Ce pourcentage pourrait varier selon le

nombre de combinaisons de connecteurs de diffeacuterentes longueurs utiliseacute Par exemple ce

nombre pourrait ecirctre reacuteduit en nrsquoutilisant qursquoune seule combinaison puisque certaines

associations proteacuteine-proteacuteine eacutetaient uniquement deacutetectables avec une combinaison preacutecise

de connecteurs Lrsquoutilisation drsquoun connecteur allongeacute pour le fragment DHFR F[12] semble

ecirctre suffisante pour deacutetecter la majoriteacute des nouvelles PPI et celles dont le signal augmente

44

Les rares cas ougrave le signal diminuait avec lrsquoaugmentation de la longueur du connecteur

seraient davantage causeacutes par des effets steacuteriques plutocirct que par une deacutestabilisation des

proteacuteines impliqueacutees Cependant ces cas peuvent tout de mecircme fournir des informations

structurales notamment en identifiant les associations les plus fortes au sein du complexe

Par ailleurs lrsquoutilisation des connecteurs allongeacutes renseigne sur lrsquoorganisation des complexes

proteacuteiques particuliegraverement lorsqursquoelle implique les proteacuteines centrales Enfin les

associations deacutetecteacutees reflegravetent bien lrsquoorganisation des complexes proteacuteiques en sous-

complexes En comparant les distances entre les proteacuteines des structures du proteacuteasome et

les reacutesultats PCA obtenus il est possible de confirmer que lrsquoaugmentation de la longueur du

connecteur permet effectivement de deacutetecter des associations entre proteacuteines plus eacuteloigneacutees

dans lrsquoespace

The modification made to the DHFR PCA represents a real advance in the study of protein-protein associations. Simply by doubling the length of the linker on the DHFR F[1,2] fragment, it is possible to increase the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the linkers of additional lengths, which would provide a validation and a point of comparison and would help detect problems that may have occurred during construction of the fusion proteins. For example, it becomes easier to spot a faulty recombination or the appearance of mutations, because interactions would be observed for the correctly built protein whereas the problematic one would show none. This control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Furthermore, the DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously unless they are brought very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over the other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would provide several resolutions of the PPI network. The first step would be to determine the maximal length that still detects plausible protein-protein associations while limiting false positives. It would also be necessary to determine the optimal increment that maximizes new information while taking into account the additional complexity that comes with each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would give a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and therefore shorter linkers, could be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not perfectly known but that are visible by electron microscopy or by other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution made possible by extending the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.

Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.

22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.

43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.

2xL) and also with the longer linker combinations but without any significant change (t-test FDR p-value above 0.05), and quantitative changes in cases where the interaction was detected with the reference linker size (2xL-2xL) and presented significant changes for at least one longer linker combination (difference greater than 1 or smaller than -1 with t-test FDR p-value < 0.05) (Table S1C).

Analysis of protein distances within complexes

Yeast protein sequences of RNApol I, II and III were obtained from SGD (http://www.yeastgenome.org) and searched against the RNApol I, II and III protein complexes of the RCSB Protein Data Bank (http://www.rcsb.org) using the usearch software. PDB files 4C3I, 4V1N and 5FJA were selected as representative monomeric complexes for RNApol I, II and III, respectively, as they included the largest number of proteins from the experimental set with the highest sequence identities. Similarly, structure 4C2M was selected as the representative RNApol I dimeric complex. Table S2B presents the identity between each RNApol structure and the experimental sequences.

The proteasome is composed of three sections: the barrel-shaped core particle (CP), the base and the lid (Fig. S2A, top). There was no complete structure of the yeast proteasome complex in the RCSB Protein Data Bank at the time of the analyses. Sequence alignment of the experimental protein sequences of the individual sections of the proteasome complex with the sequences of the RCSB Protein Data Bank identified PDB IDs 5A5B and 5CZ4. Structure 5A5B is composed of the base, the lid and half of the core. Structure 5CZ4 is composed of a full core. A complete proteasome structure was built by superposing two 5A5B structures on the structure of 5CZ4, one on each side of the CP, using the super command in PyMOL. Visual inspection of the resulting superposed 5A5B structures showed an incorrect overlap in the central core (Fig. S2B). This overlap is well resolved in 5CZ4. Thus, the final proteasome structure was composed of 5A5B for the base, the lid and the outer rings of the core, while the inner rings of the core were taken from structure 5CZ4. Fig. S2A summarizes the methodology used to build the final proteasome structure. Table S2C presents the identity between the built structure and the experimental sequences.

The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between each pair of nodes lying within a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the node pair. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal distances possible was used for the analyses.
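The surface-path distance described above can be illustrated with a minimal sketch. The text states that NetworkX's Dijkstra implementation was used; the version below swaps in a plain-Python Dijkstra so that it has no third-party dependency. The node names, coordinates and the helper names `build_surface_graph` and `shortest_path_length` are hypothetical and only serve the toy example; the 15 Å cutoff is the value quoted for RNApol II.

```python
import heapq
import math


def build_surface_graph(coords, cutoff):
    """Connect every pair of nodes closer than `cutoff` (in Å).

    coords: dict mapping a node label to the (x, y, z) position of a
    surface C-alpha atom. Edge weights are Euclidean distances.
    """
    adjacency = {node: [] for node in coords}
    nodes = list(coords)
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            d = math.dist(coords[a], coords[b])
            if d <= cutoff:
                adjacency[a].append((b, d))
                adjacency[b].append((a, d))
    return adjacency


def shortest_path_length(adjacency, source, target):
    """Weighted shortest path length (Dijkstra); inf if unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in adjacency[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf


# Toy example (hypothetical coordinates, not taken from any PDB file).
toy_coords = {
    "A:Cterm": (0.0, 0.0, 0.0),
    "s1": (10.0, 0.0, 0.0),
    "s2": (20.0, 0.0, 0.0),
    "B:Cterm": (20.0, 10.0, 0.0),
}
toy_graph = build_surface_graph(toy_coords, cutoff=15.0)
toy_distance = shortest_path_length(toy_graph, "A:Cterm", "B:Cterm")
```

In this toy graph the direct A-to-B segment exceeds the cutoff, so the path has to travel along intermediate surface nodes, which is the reason a surface graph gives a longer, more realistic distance than a straight line through the complex.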

All PPI data related to the global PCA and intra-complex experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.
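As a small illustration of the linker nomenclature, the sketch below builds the amino-acid sequences of the 2xL, 3xL and 4xL linkers from the GGGGS repeat described above. The function name `linker_sequence` and the dictionary `linkers` are hypothetical names chosen for this example.

```python
LINKER_UNIT = "GGGGS"  # repeat unit of the standard DHFR PCA linker


def linker_sequence(repeats):
    """Return the peptide sequence made of `repeats` GGGGS units."""
    if repeats < 1:
        raise ValueError("a linker needs at least one repeat")
    return LINKER_UNIT * repeats


# 2xL is the standard linker; 3xL and 4xL are the extended versions.
linkers = {f"{n}xL": linker_sequence(n) for n in (2, 3, 4)}

for name, seq in linkers.items():
    print(name, seq, len(seq), "residues")
```

The 4xL linker is therefore 20 residues long, twice the 10 residues of the standard 2xL.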

To verify the effect of longer linker length on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] accounting for only one PPI) after filtration.
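The distinction between oriented protein pairs and unique pairs can be made concrete with a short sketch. It assumes that each detected interaction is recorded as a (bait, prey) tuple, with the bait fused to DHFR F[1,2] and the prey to DHFR F[3]; the function name `count_pairs` and the placeholder proteins X, Y and Z are hypothetical.

```python
def count_pairs(detected):
    """Count oriented and unique protein pairs.

    detected: iterable of (bait, prey) tuples. The same pair may appear
    several times (for example once per linker combination), and
    reciprocal orientations (X, Y) and (Y, X) count as one unique pair.
    """
    oriented = set(detected)
    unique = {frozenset(pair) for pair in oriented}
    return len(oriented), len(unique)


# Hypothetical example: X-Y is detected in both orientations, X-Z in one.
example = [("X", "Y"), ("Y", "X"), ("X", "Z"), ("X", "Z")]
n_oriented, n_unique = count_pairs(example)
print(n_oriented, n_unique)  # prints "3 2"
```

Using a `frozenset` makes the pair order-independent, so a homomeric pair such as (X, X) is also handled without special casing.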

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228 or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the conclusion that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). The PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions were detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1, from the core subunit, and proteins from lobe A or lobe B (Fig. 1D, right panel). Together, these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, the PCA signal obtained with longer linkers allowed
better discrimination between the different subunits of large complexes. This is particularly well
illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when
the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid),
regardless of the linker length, though the fraction is systematically higher with longer linkers.
The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and
right panels). Structural biology in living cells could thus gain from PPI data obtained with
several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested
whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins
within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance,
we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We
find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the
proteasome, the complex for which we have the most distance values, a negative correlation
is observed between the pairwise distance and the interaction z-score of PPIs for all linker
lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better
signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with
longer linkers is clearly visible in the cumulative distribution of z-scores as a function
of pairwise distance, where positive z-scores accumulate up to a longer distance for the 4xL-4xL
combination than for the other combinations (Fig. 2B, right panel). The density distribution
of distances within complexes is also slightly shifted towards larger distances for longer
linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D).
Finally, we find that the distance between proteins is significantly longer in cases where a longer
linker increases the signal or leads to the detection of new interactions (Fig. 2C). This
demonstrates once again that a longer linker enhances the ability to detect interactions,
especially for proteins that are more distant in space.
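The rank correlation between pairwise distance and z-score described above can be illustrated with a minimal sketch. It is not the analysis code used in this study: the function names are hypothetical and the six data points are invented purely to show the computation. It uses only the Python standard library and computes a Spearman coefficient as the Pearson correlation of average ranks.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented illustration data (NOT measurements from this study):
# C-terminus distances and the z-scores of the corresponding PPIs.
distances = [20.0, 35.0, 50.0, 80.0, 120.0, 150.0]
z_scores = [9.1, 7.4, 7.9, 3.2, 2.8, 0.5]
rho = spearman(distances, z_scores)
print(f"Spearman r = {rho:.2f}")
```

A negative coefficient, as in this toy example, corresponds to the trend reported for the proteasome: pairs that are farther apart tend to give lower z-scores.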

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes
remains challenging, largely because it is difficult to study how proteins interact directly and
indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure
protein proximity in living cells and among endogenously expressed proteins. Here we show
that DHFR PCA, with a modest increase in linker size from 41 Å to 82 Å, can be used to
detect interactions in these specific conditions with an increased signal-to-noise ratio and
an enhanced ability to detect distant PPIs, including interactions among complexes and
subcomplexes within large complexes. Because a single longer linker is generally sufficient
to detect new interactions, the current strains from the DHFR PCA collection could be used
as preys, requiring only the construction of baits with different linker sizes. PCA is
therefore an addition to the other methods available to obtain low-resolution structural
information on the subunits of complexes, which include chemical cross-linking of protein
complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation
in mammalian cells (68). Despite major advances in these other technologies in recent
years, PCA will remain the simplest assay because it requires minimal infrastructure
investment and can be adapted for high-throughput screening, which is still difficult to
achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research grants 299432
and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and
Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was
supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry
laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical
analyses.


Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment
complementation assay (PCA) screen and prove useful for inferring the super-organization
of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained
in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a
4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red
triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex
PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome.
Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly
decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new
PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker
combination). (C) Proportions of quantitatively changed interactions and new PPIs versus
unchanged PPIs for all complexes, counting each pair of reciprocal interactions, such as
X-DHFR F[1,2]/Y-DHFR F[3] and Y-DHFR F[1,2]/X-DHFR F[3], as a single PPI. (D) Circle plots of
all detected PPIs for selected complexes. Line thickness is proportional to the difference
between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs.
Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside
colored boxes represent proteins that were absent from the experiment. (E) Proportion of
detected PPIs out of the total tested for each combination of subcomplexes within complexes.
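The bookkeeping described in panels (B) and (C), where reciprocal bait-prey orientations count as a single PPI and each PPI is labelled as new, quantitatively changed or unchanged, can be sketched as follows. This is an illustrative sketch only, not the analysis pipeline of the study: the function names and the placeholder proteins X, Y, Z and W are hypothetical, and the set of significantly changed pairs is assumed to come from a separate statistical test that is not shown here.

```python
def unordered(bait, prey):
    """Collapse X-F[1,2]/Y-F[3] and Y-F[1,2]/X-F[3] into one unordered pair."""
    return frozenset((bait, prey))


def classify_ppis(reference_hits, longer_linker_hits, significantly_changed):
    """Label each unique PPI relative to the 2xL-2xL reference.

    reference_hits: (bait, prey) pairs detected with 2xL-2xL.
    longer_linker_hits: pairs detected with at least one longer linker.
    significantly_changed: pairs whose signal differs significantly from
        the reference (assumed to be computed elsewhere).
    """
    ref = {unordered(*p) for p in reference_hits}
    longer = {unordered(*p) for p in longer_linker_hits}
    changed = {unordered(*p) for p in significantly_changed}

    labels = {}
    for pair in ref | longer:
        if pair not in ref:
            labels[pair] = "new"        # only seen with a longer linker
        elif pair in changed:
            labels[pair] = "changed"    # detected before, signal differs
        else:
            labels[pair] = "unchanged"
    return labels


# Placeholder proteins, purely illustrative.
reference_hits = [("X", "Y"), ("Y", "X"), ("X", "Z")]
longer_linker_hits = [("X", "Y"), ("W", "X"), ("X", "W")]
significantly_changed = [("Z", "X")]

labels = classify_ppis(reference_hits, longer_linker_hits, significantly_changed)
for pair, label in sorted(labels.items(), key=lambda kv: sorted(kv[0])):
    print("-".join(sorted(pair)), label)
```

Using `frozenset` as the key makes the orientation of bait and prey irrelevant, which mirrors the "unique" counts reported alongside the total counts in the text.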


Figure 2. Longer linkers allow for the detection of more distant proteins within
complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at
least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red:
proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins
located at different distances or in different subunits are highlighted on each structure.
Distances between the C-termini of these selected proteins and the associated PPI z-scores for
these newly detected interactions are indicated in the tables. DHFR fragments have also been
modeled and are presented at the same scale as the proteasome structure. (B) (Left)
Correlation between all detected PPIs in the proteasome (z-scores) and the distance between
the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36,
p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40,
p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of
cumulative z-scores for the proteasome PPIs according to the different protein pairwise
distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome
complexes according to the distance between the C-termini, for interactions that are not
affected by longer linkers and those that increase in signal or that are newly detected.
p-values of Wilcoxon tests are shown.


Table S1A. Description of the strains constructed and used for this study.
Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment.
Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complexes experiment.
Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study.
Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling.
Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference | Yeast protein | Complex | Identity (%)
4C2M chain 1 | Rpc10 | RNApol I | 100
4C2M chain 2 | Rpa34 | RNApol I | 92.4
4C2M chain 3 | Rpa49 | RNApol I | 94.4
4C2M chain 4 | Rpa43 | RNApol I | 100
4C2M chain 5 | Rpa190 | RNApol I | 89.7
4C2M chain 6 | Rpc40 | RNApol I | 100
4C2M chain 7 | Rpa135 | RNApol I | 97.2
4C2M chain 8 | Rpb5 | RNApol I | 100
4C2M chain 9 | Rpa14 | RNApol I | 59.6
4C2M chain 10 | Rpa43 | RNApol I | 81.4
4C2M chain 11 | Rpo26 | RNApol I | 100
4C2M chain 12 | Rpa12 | RNApol I | 100
4C2M chain 13 | Rpb8 | RNApol I | 88.2
4C2M chain 14 | Rpc19 | RNApol I | 100
4C2M chain 15 | Rpb10 | RNApol I | 100
4C2M chain 16 | Rpa49 | RNApol I | 100
4C2M chain 17 | Rpc10 | RNApol I | 100
4C2M chain 18 | Rpa43 | RNApol I | 100
4C2M chain 19 | Rpa34 | RNApol I | 92.4
4C2M chain 20 | Rpa135 | RNApol I | 96.2
4C2M chain 21 | Rpa190 | RNApol I | 88.5
4C2M chain 22 | Rpa14 | RNApol I | 55.1
4C2M chain 23 | Rpc40 | RNApol I | 100
4C2M chain 24 | Rpo26 | RNApol I | 100
4C2M chain 25 | Rpb5 | RNApol I | 100
4C2M chain 26 | Rpb8 | RNApol I | 88.2
4C2M chain 27 | Rpa43 | RNApol I | 80.2
4C2M chain 28 | Rpb10 | RNApol I | 100
4C2M chain 29 | Rpa12 | RNApol I | 96
4C2M chain 30 | Rpc19 | RNApol I | 100
4C3I chain A | Rpa190 | RNApol I | 89.2
4C3I chain C | Rpc40 | RNApol I | 99.3
4C3I chain B | Rpa135 | RNApol I | 98.2
4C3I chain E | Rpb5 | RNApol I | 100
4C3I chain D | Rpa14 | RNApol I | 55.1
4C3I chain G | Rpa43 | RNApol I | 78.3
4C3I chain F | Rpo26 | RNApol I | 100
4C3I chain I | Rpa12 | RNApol I | 100
4C3I chain H | Rpb8 | RNApol I | 84.7
4C3I chain K | Rpc19 | RNApol I | 100
4C3I chain J | Rpb10 | RNApol I | 100
4C3I chain M | Rpa49 | RNApol I | 97.2
4C3I chain L | Rpc10 | RNApol I | 100
4C3I chain N | Rpa34 | RNApol I | 88
4V1N chain A | Rpo21 | RNApol II | 97.9
4V1N chain C | Rpb3 | RNApol II | 100
4V1N chain B | Rpb2 | RNApol II | 93.6
4V1N chain E | Rpb5 | RNApol II | 100
4V1N chain D | Rpb4 | RNApol II | 80.8
4V1N chain G | Rpb7 | RNApol II | 100
4V1N chain F | Rpo26 | RNApol II | 100
4V1N chain I | Rpb9 | RNApol II | 100
4V1N chain H | Rpb8 | RNApol II | 91
4V1N chain K | Rpb11 | RNApol II | 100
4V1N chain J | Rpb10 | RNApol II | 100
4V1N chain L | Rpc10 | RNApol II | 100
4V1N chain R | Tfg2 | RNApol II | 60.3
5FJA chain A | Rpo31 | RNApol III | 96.2
5FJA chain C | Rpc40 | RNApol III | 100
5FJA chain B | Ret1 | RNApol III | 100
5FJA chain E | Rpb5 | RNApol III | 100
5FJA chain D | Rpc17 | RNApol III | 73.9
5FJA chain G | Rpc25 | RNApol III | 85.8
5FJA chain F | Rpo26 | RNApol III | 100
5FJA chain I | Rpc11 | RNApol III | 82.7
5FJA chain H | Rpb8 | RNApol III | 94.5
5FJA chain K | Rpc19 | RNApol III | 100
5FJA chain J | Rpb10 | RNApol III | 100
5FJA chain M | Rpc37 | RNApol III | 84.9
5FJA chain L | Rpc10 | RNApol III | 100
5FJA chain O | Rpc82 | RNApol III | 84.3
5FJA chain N | Rpc53 | RNApol III | 73.8
5FJA chain Q | Rpc31 | RNApol III | 100
5FJA chain P | Rpc34 | RNApol III | 57.2

Table S2C. Identity between the proteasome structure and the experimental sequence

Reference | Yeast protein | Complex | Identity (%)
5CZ4-centered chain A | Pre8 | Proteasome | 100
5CZ4-centered chain AA | Pre4 | Proteasome | 100
5CZ4-centered chain B | Pre9 | Proteasome | 100
5CZ4-centered chain BA | Pre3 | Proteasome | 100
5CZ4-centered chain C | Pre6 | Proteasome | 100
5CZ4-centered chain D | Pup2 | Proteasome | 97.1
5CZ4-centered chain E | Pre5 | Proteasome | 100
5CZ4-centered chain F | Pre10 | Proteasome | 100
5CZ4-centered chain G | Scl1 | Proteasome | 100
5CZ4-centered chain H | Pup1 | Proteasome | 100
5CZ4-centered chain I | Pup3 | Proteasome | 100
5CZ4-centered chain J | Pre1 | Proteasome | 100
5CZ4-centered chain K | Pre2 | Proteasome | 100
5CZ4-centered chain L | Pre7 | Proteasome | 100
5CZ4-centered chain M | Pre4 | Proteasome | 100
5CZ4-centered chain N | Pre3 | Proteasome | 100
5CZ4-centered chain O | Pre8 | Proteasome | 100
5CZ4-centered chain P | Pre9 | Proteasome | 100
5CZ4-centered chain Q | Pre6 | Proteasome | 100
5CZ4-centered chain R | Pup2 | Proteasome | 97.1
5CZ4-centered chain S | Pre5 | Proteasome | 100
5CZ4-centered chain T | Pre10 | Proteasome | 100
5CZ4-centered chain U | Scl1 | Proteasome | 100
5CZ4-centered chain V | Pup1 | Proteasome | 100
5CZ4-centered chain W | Pup3 | Proteasome | 100
5CZ4-centered chain X | Pre1 | Proteasome | 100
5CZ4-centered chain Y | Pre2 | Proteasome | 100
5CZ4-centered chain Z | Pre7 | Proteasome | 100
5A5B-centered chain A | Pre3 | Proteasome | 100
5A5B-centered chain AA | Rpn7 | Proteasome | 100
5A5B-centered chain B | Pup1 | Proteasome | 100
5A5B-centered chain BA | Rpn3 | Proteasome | 100
5A5B-centered chain C | Pup3 | Proteasome | 100
5A5B-centered chain CA | Rpn12 | Proteasome | 100
5A5B-centered chain D | Pre1 | Proteasome | 100
5A5B-centered chain DA | Rpn8 | Proteasome | 82.9
5A5B-centered chain E | Pre2 | Proteasome | 99.5
5A5B-centered chain EA | Rpn11 | Proteasome | 89.5
5A5B-centered chain F | Pre7 | Proteasome | 100
5A5B-centered chain FA | Rpn10 | Proteasome | 100
5A5B-centered chain G | Pre4 | Proteasome | 100
5A5B-centered chain GA | Rpn13 | Proteasome | 100
5A5B-centered chain HA | Sem1 | Proteasome | 100
5A5B-centered chain IA | Rpn1 | Proteasome | 85.9
5A5B-centered chain J | Scl1 | Proteasome | 100
5A5B-centered chain K | Pre8 | Proteasome | 100
5A5B-centered chain L | Pre9 | Proteasome | 100
5A5B-centered chain M | Pre6 | Proteasome | 100
5A5B-centered chain N | Pup2 | Proteasome | 100
5A5B-centered chain O | Pre5 | Proteasome | 100
5A5B-centered chain P | Pre10 | Proteasome | 100
5A5B-centered chain Q | Rpt1 | Proteasome | 88
5A5B-centered chain R | Rpt2 | Proteasome | 100
5A5B-centered chain S | Rpt6 | Proteasome | 100
5A5B-centered chain T | Rpt3 | Proteasome | 100
5A5B-centered chain U | Rpt4 | Proteasome | 100
5A5B-centered chain V | Rpt5 | Proteasome | 93.1
5A5B-centered chain W | Rpn2 | Proteasome | 90.9
5A5B-centered chain X | Rpn9 | Proteasome | 100
5A5B-centered chain Y | Rpn5 | Proteasome | 100
5A5B-centered chain Z | Rpn6 | Proteasome | 100
Constructed proteasome chain 1 | Pup1 | Proteasome | 100
Constructed proteasome chain 10 | Pre8 | Proteasome | 100
Constructed proteasome chain 11 | Pre9 | Proteasome | 100
Constructed proteasome chain 12 | Pre6 | Proteasome | 100
Constructed proteasome chain 13 | Pup2 | Proteasome | 100
Constructed proteasome chain 14 | Pre5 | Proteasome | 100
Constructed proteasome chain 15 | Pre10 | Proteasome | 100
Constructed proteasome chain 16 | Rpt1 | Proteasome | 88
Constructed proteasome chain 17 | Rpt2 | Proteasome | 100
Constructed proteasome chain 18 | Rpt6 | Proteasome | 100
Constructed proteasome chain 19 | Rpt3 | Proteasome | 100
Constructed proteasome chain 2 | Pup3 | Proteasome | 100
Constructed proteasome chain 20 | Rpt4 | Proteasome | 100
Constructed proteasome chain 21 | Rpt5 | Proteasome | 93.1
Constructed proteasome chain 22 | Rpn2 | Proteasome | 90.9
Constructed proteasome chain 23 | Rpn9 | Proteasome | 100
Constructed proteasome chain 24 | Rpn5 | Proteasome | 100
Constructed proteasome chain 25 | Rpn6 | Proteasome | 100
Constructed proteasome chain 26 | Rpn7 | Proteasome | 100
Constructed proteasome chain 27 | Rpn3 | Proteasome | 100
Constructed proteasome chain 28 | Rpn12 | Proteasome | 100
Constructed proteasome chain 29 | Rpn8 | Proteasome | 82.9
Constructed proteasome chain 3 | Pre1 | Proteasome | 100
Constructed proteasome chain 30 | Rpn11 | Proteasome | 89.5
Constructed proteasome chain 31 | Rpn10 | Proteasome | 100
Constructed proteasome chain 32 | Rpn13 | Proteasome | 100
Constructed proteasome chain 33 | Sem1 | Proteasome | 100
Constructed proteasome chain 34 | Rpn1 | Proteasome | 85.9
Constructed proteasome chain 35 | Pup1 | Proteasome | 100
Constructed proteasome chain 36 | Pup3 | Proteasome | 100
Constructed proteasome chain 37 | Pre1 | Proteasome | 100
Constructed proteasome chain 38 | Pre2 | Proteasome | 100
Constructed proteasome chain 39 | Pre7 | Proteasome | 100
Constructed proteasome chain 4 | Pre2 | Proteasome | 100
Constructed proteasome chain 40 | Pre4 | Proteasome | 100
Constructed proteasome chain 41 | Pre3 | Proteasome | 100
Constructed proteasome chain 42 | Pre4 | Proteasome | 100
Constructed proteasome chain 45 | Scl1 | Proteasome | 100
Constructed proteasome chain 46 | Pre8 | Proteasome | 100
Constructed proteasome chain 47 | Pre9 | Proteasome | 100
Constructed proteasome chain 48 | Pre6 | Proteasome | 100
Constructed proteasome chain 49 | Pup2 | Proteasome | 100
Constructed proteasome chain 5 | Pre7 | Proteasome | 100
Constructed proteasome chain 50 | Pre5 | Proteasome | 100
Constructed proteasome chain 51 | Pre10 | Proteasome | 100
Constructed proteasome chain 52 | Rpt1 | Proteasome | 88
Constructed proteasome chain 53 | Rpt2 | Proteasome | 100
Constructed proteasome chain 54 | Rpt6 | Proteasome | 100
Constructed proteasome chain 55 | Rpt3 | Proteasome | 100
Constructed proteasome chain 56 | Rpt4 | Proteasome | 100
Constructed proteasome chain 57 | Rpt5 | Proteasome | 93.1
Constructed proteasome chain 58 | Rpn2 | Proteasome | 90.9
Constructed proteasome chain 59 | Rpn9 | Proteasome | 100
Constructed proteasome chain 6 | Pre3 | Proteasome | 100
Constructed proteasome chain 60 | Rpn5 | Proteasome | 100
Constructed proteasome chain 61 | Rpn6 | Proteasome | 100
Constructed proteasome chain 62 | Rpn7 | Proteasome | 100
Constructed proteasome chain 63 | Rpn3 | Proteasome | 100
Constructed proteasome chain 64 | Rpn12 | Proteasome | 100
Constructed proteasome chain 65 | Rpn8 | Proteasome | 82.9
Constructed proteasome chain 66 | Rpn11 | Proteasome | 89.5
Constructed proteasome chain 67 | Rpn10 | Proteasome | 100
Constructed proteasome chain 68 | Rpn13 | Proteasome | 100
Constructed proteasome chain 69 | Sem1 | Proteasome | 100
Constructed proteasome chain 70 | Rpn1 | Proteasome | 85.9
Constructed proteasome chain 9 | Scl1 | Proteasome | 100

Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I,
II and III and proteasome structures

Columns: yeast protein, complex, reference structure, number of missing residues at the C-terminus

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein
stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony
size) obtained in the global PCA (top left) and in the intra-complexes (proteasome, top right;
RNApol I, II and III, bottom left; and COG complex, bottom right) experiments. PPIs with
a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have
a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal
interactions with the 4xL-4xL combination. Correlation coefficients for the other
combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D)
Density of PPI z-scores for the proteasome for all combinations of linker lengths according
to the distance between the interacting proteins. The red line represents the density of
distances for all interactions. The distribution for detected interactions is shifted to the left
because proteins are closer to each other when the interactions are detected. The 4xL-4xL
distribution is also slightly shifted to the right due to the ability of the 4xL to detect
interactions further in space. (E) Repetition of the standard DHFR PCA for selected results
of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR
PCA in a spot-dilution assay of selected results of the intra-complexes experiment. Examples
for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates
with colony size in the standard PCA (left).
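Panel (B) states that positive PPIs are those whose colony size exceeds a threshold corresponding to a z-score above 2.5. The sketch below shows one common way such a threshold can be applied. How the background distribution was defined in this study is not given in this section, so the use of a separate list of background colony sizes, the function names and all numbers are assumptions made for illustration.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # z-score cutoff for a positive PPI, as stated in the caption


def z_score(colony_size, background_sizes):
    """Deviation of a colony size from the background, in standard deviations.

    Assumption: the background is summarized by its mean and sample
    standard deviation; the study may define the background differently.
    """
    return (colony_size - mean(background_sizes)) / stdev(background_sizes)


def positive_ppis(colony_sizes, background_sizes, threshold=Z_THRESHOLD):
    """Return the names of PPIs whose z-score is above the threshold."""
    return [
        name
        for name, size in colony_sizes.items()
        if z_score(size, background_sizes) > threshold
    ]


# Invented colony sizes in arbitrary units (NOT data from this study).
background = [100, 110, 90, 105, 95]
colony_sizes = {"A-B": 130, "A-C": 105, "B-C": 119}

hits = positive_ppis(colony_sizes, background)
print(hits)
```

With a fixed background, the z-score cutoff and the colony-size threshold drawn on the histograms are two views of the same decision rule.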


Figure S2. Illustration of the methods used to build the proteasome structure and to
calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB
structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on
the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two
5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap
between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right).
(Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core.
(C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5.
Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the
distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues
used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple
spheres: dot surface. Green spheres: surface residues on the proteasome.
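Panels (C) and (D) describe a distance-weighted shortest path between two C-termini that runs through surface residues. One standard way to compute such a path is Dijkstra's algorithm on a graph whose nodes are surface points and whose edge weights are Euclidean distances. The sketch below illustrates that idea only: the neighbor cutoff, the node names and the coordinates are hypothetical, and the actual procedure used to build the graph in this study is not described in this section.

```python
import heapq
from math import dist, inf


def shortest_surface_path(points, start, goal, cutoff):
    """Length of the shortest path from start to goal through surface points.

    points: mapping of node name -> (x, y, z) coordinates.
    cutoff: two nodes are connected only if they are at most this far apart
        (a hypothetical parameter chosen for this illustration).
    Returns inf if the goal cannot be reached.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        travelled, node = heapq.heappop(queue)
        if node == goal:
            return travelled
        if travelled > best.get(node, inf):
            continue  # stale queue entry, a shorter route was already found
        for other, coords in points.items():
            if other == node:
                continue
            step = dist(points[node], coords)
            if step <= cutoff and travelled + step < best.get(other, inf):
                best[other] = travelled + step
                heapq.heappush(queue, (travelled + step, other))
    return inf


# Invented coordinates in arbitrary units (NOT taken from a real structure).
points = {
    "cterm_a": (0, 0, 0),
    "surface_1": (3, 0, 0),
    "surface_2": (3, 4, 0),
    "cterm_b": (6, 4, 0),
}

path_length = shortest_surface_path(points, "cterm_a", "cterm_b", cutoff=4.5)
straight_line = dist(points["cterm_a"], points["cterm_b"])
print(path_length, round(straight_line, 2))
```

In this toy example the path that hops between neighboring surface points is longer than the straight line between the two ends, which is the reason for using a path along the surface rather than a direct distance as the proxy.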


General conclusion

The aim of this project was to develop a relatively simple hybrid method. The term
hybrid method refers to a method that detects associations between proteins that are close
to each other in space without necessarily interacting physically. Such a method would
make it possible to dissect the architecture of protein complexes in greater depth. In
practice, the approach consisted of modifying the length of the linkers used in DHFR PCA in
S. cerevisiae. To validate the method, it was first necessary to verify whether increasing
the linker length changed the interactions detected. It was also relevant to test the
application of the method to the study of protein complexes using several combinations of
linkers of different lengths. Finally, the validation could be completed by comparing the
results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling
the length of one linker, significantly increased the signal-to-noise ratio, allowing better
identification of associations. Seven new associations were observed within protein
complexes and between different complexes, notably between the proteasome and the actin
cytoskeleton. The nature of the detected associations suggests that the specificity of DHFR
PCA is preserved despite the change in linker length. The in-depth study of the five protein
complexes shows that this variant of DHFR PCA detects new interactions while preserving the
specificity of the method. Indeed, among all unique interactions detected, more than 30%
were new. One could therefore expect to obtain nearly as many new interactions if this
variant of PCA were applied to protein complexes that have already been studied. This
percentage could vary with the number of combinations of linker lengths used. For example,
it could be lower if only one combination were used, since some protein-protein associations
were detectable only with a specific combination of linkers. Using an elongated linker for
the DHFR F[1,2] fragment seems to be sufficient to detect most of the new PPIs and those
whose signal increases. The rare cases where the signal decreased as the linker length
increased are more likely caused by steric effects than by destabilization of the proteins
involved. These cases can nevertheless provide structural information, notably by
identifying the strongest associations within the complex. In addition, the use of elongated
linkers is informative about the organization of protein complexes, particularly when it
involves central proteins. Finally, the detected associations reflect well the organization
of protein complexes into subcomplexes. Comparing the distances between proteins in the
proteasome structures with the PCA results confirms that increasing the linker length does
allow the detection of associations between proteins that are farther apart in space.

The modification made to DHFR PCA represents a valuable advance in the study of
protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2]
fragment, it is possible to increase the ability to detect distant protein-protein
associations. In future experiments, it would be appropriate to use the standard linker in
addition to linkers of other lengths, which would provide a validation and a point of
comparison and would reveal problems that may have occurred during construction of the
proteins. For example, it is easier to spot a problem of incorrect recombination or of newly
arisen mutations: interactions would be observed for the correctly constructed protein,
whereas the problematic one would show none. Admittedly, adding this control makes the
experiments and analyses more complex. Despite this drawback, this variant of DHFR PCA
provides an additional hybrid method that remains relatively simple. It requires no special
infrastructure, yet it can also be applied at large scale using a robotic platform.
Moreover, DHFR PCA is an in vivo method that retains the endogenous promoter for protein
expression. The fragments do not tend to associate spontaneously unless they are very close
together, which reduces false positives. DHFR PCA can be performed on either solid or liquid
medium. It is therefore easy to study PPIs under multiple growth conditions or in the
presence of cellular perturbations. It can also be followed in real time, which gives access
to the dynamics of interactions (56). These features offer certain advantages over other
hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a
range of linker lengths providing several resolutions of the PPI network. One would first
need to determine the maximum length that still detects plausible protein-protein
associations while limiting false positives. One would also need to determine the optimal
increment to maximize new information while taking into account the additional complexity
introduced by each added linker. The availability of robotic platforms makes it more
realistic to create collections of DHFR F[1,2]-tagged proteins with different linker
lengths. Such additional collections would provide a picture of the yeast protein-protein
association network at different resolutions, from fine to coarse. Indeed, the longer the
linker, the more distant the detected associations, which lowers the molecular resolution.
Before investigating a protein complex more exhaustively, its characteristics, such as its
size and flexibility, should be taken into account. For small protein complexes, a finer
resolution, and therefore shorter linkers, may be sufficient, whereas the resolution should
be coarser for large protein complexes.

The method developed during this master's project becomes particularly interesting for the
study of macromolecular protein complexes. These are complexes whose composition is not
perfectly known but which are visible by electron microscopy or other imaging methods. The
size of these complexes greatly limits their study and makes determining their architecture
a challenge. Processing bodies and stress granules are examples. They are involved,
respectively, in the degradation and the preservation of messenger RNA during cellular
stress, and they are notably linked to various diseases such as cancer and acquired
immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the
linker would give a general picture of their architecture. At the scale of an organism's
proteome, this method would provide a better definition of the organization of the cellular
machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


The distances between the different proteins within a complex were calculated between C-terminal residues. In several cases, the structure of the protein is not complete in the C-terminal section. In these cases, the last available residue was used instead to calculate the distance (a list is provided in Table S2D). The distances were calculated from the weighted shortest path using the Dijkstra algorithm as implemented in NetworkX (an example of the shortest path between Scl1p and Rpn5p is presented in Fig. S2C). The Cα atoms of surface residues were used as nodes to build the graph. Edges were placed between pairs of nodes lying within a distance cutoff of 15 Å for the RNApol II and of 30 Å for the proteasome. The weight of each edge was equal to the distance between the two nodes. Surface residues were identified as follows. First, the structure of the protein complex was represented using the "show dots" and "set dots_solvent" commands in PyMOL, using a solvent radius of 10 Å for the RNApol II complex and of 20 Å for the proteasome. These dots were exported in the "wrl" graphic file format. From this file, the coordinates of each dot were extracted. Residues within 15 Å of any dot of the RNApol II structure and within 20 Å of any dot of the proteasome structure were considered surface residues (see Fig. S2D for a representation of the method for the proteasome). In cases where multiple copies of the proteins were present within the complexes, the mean of the minimal distances possible was used for the analyses.
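The surface-graph distance calculation described above can be sketched as follows. The original analysis relied on the Dijkstra implementation in NetworkX; to stay self-contained, this sketch uses a small standard-library implementation instead. The coordinates, the cutoff passed in the example and the function names (`build_graph`, `shortest_path_length`) are hypothetical and serve only as an illustration.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Link every pair of points closer than `cutoff`; edge weight = distance."""
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf when target is unreachable."""
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return math.inf


# Hypothetical surface Calpha coordinates (in Angstrom), not taken from any structure.
toy_coords = [(0.0, 0.0, 0.0), (12.0, 0.0, 0.0), (12.0, 12.0, 0.0), (100.0, 0.0, 0.0)]
toy_graph = build_graph(toy_coords, cutoff=15.0)
path_0_2 = shortest_path_length(toy_graph, 0, 2)  # must go through node 1
path_0_3 = shortest_path_length(toy_graph, 0, 3)  # node 3 is isolated
```

Because edges only connect nearby surface points, the path length follows the surface of the complex rather than cutting straight through it, which is the reason for using a graph distance instead of a direct Euclidean distance.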

All PPI data related to the global PCA and intra-complex experiments can be found in Tables S1B and S1C.

Results and discussion

Longer linkers increase signal-to-noise ratio in large-scale screens

The standard linker used in DHFR PCA consists of two repetitions of the peptide GGGGS (55), which we refer to as the 2x-linker (2xL). We modified existing plasmids to include three and four repetitions of this sequence (referred to as 3xL and 4xL) and used them as PCR templates for both complementary DHFR fragments (DHFR F[1,2] and DHFR F[3]) to be introduced in yeast (see Table S1A for the strains used in this study). We assessed whether longer linkers destabilize proteins and therefore interfere with the detection of PPIs. No evidence of protein degradation was found for any of the six proteins examined using antibodies targeting the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, with the 2xL, 3xL and 4xL fused to the DHFR F[1,2] fragment each time. Using high-density yeast colony arrays (57), we queried these baits (n=45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These include proteins known to interact with the baits, proteins that are within the same complexes as the baits and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 25) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal represent cases between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). All of these results thus show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
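The size of the screen and the idea of calling hits from z-scores can be illustrated with a short sketch. How the z-scores were computed is not detailed in this section, so the sketch assumes the simplest definition (deviation from the mean of a background distribution, in units of its standard deviation). The colony values, the `threshold` argument and the function names are hypothetical.

```python
import statistics

N_PROTEINS = 15   # bait proteins (from the text)
N_LINKERS = 3     # 2xL, 3xL and 4xL
N_PREYS = 592     # prey strains (from the text)

n_baits = N_PROTEINS * N_LINKERS   # bait strains queried
n_tests = n_baits * N_PREYS        # bait-prey combinations tested


def z_scores(values, background):
    """Assumed definition: (value - background mean) / background standard deviation."""
    mu = statistics.mean(background)
    sigma = statistics.stdev(background)
    return [(v - mu) / sigma for v in values]


def call_hits(values, background, threshold):
    """Indices of the values whose z-score exceeds the chosen threshold."""
    return [i for i, z in enumerate(z_scores(values, background)) if z > threshold]


# Hypothetical colony-size measurements, for illustration only.
background = [10.0, 12.0, 11.0, 9.0, 8.0]
colonies = [10.5, 30.0, 9.0, 18.0]
hits = call_hits(colonies, background, threshold=2.0)
```

The threshold is left as a parameter on purpose, since the cutoff has to be chosen from the distribution of the actual screen data.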

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique pairs, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] counted as a single PPI).
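Counting reciprocal interactions only once amounts to treating each bait-prey pair as unordered. A minimal sketch, assuming detected PPIs are stored as (bait, prey) tuples; the list of pairs below is a hypothetical toy example, not data from the study.

```python
def unique_pairs(directed_pairs):
    """Collapse (bait, prey) and (prey, bait) into a single unordered pair."""
    return {frozenset(pair) for pair in directed_pairs}


# Hypothetical detected PPIs, written as (bait, prey).
detected = [
    ("Rpb2", "Rpb3"),
    ("Rpb3", "Rpb2"),   # reciprocal of the first pair
    ("Rpo21", "Rpb2"),
    ("Cog1", "Cog2"),
]
unique = unique_pairs(detected)
```

Here four directed pairs collapse to three unique ones, in the same way that 228 protein pairs reduce to 197 unique pairs in the text.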

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was indeed quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient for the detection of new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than through the destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions were detected for RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1, from the core subunit, and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.
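The fraction of detected PPIs per combination of subcomplexes can be tabulated as sketched below. This is a minimal illustration, assuming each protein is annotated with one subcomplex; the protein names P1 to P4, their assignments and the function name are hypothetical.

```python
from collections import defaultdict


def detection_fraction(tested, detected, subcomplex):
    """Fraction of tested pairs that were detected, per subcomplex combination."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for pair in tested:
        a, b = pair
        # Sort the labels so that base-lid and lid-base share one key.
        key = tuple(sorted((subcomplex[a], subcomplex[b])))
        totals[key] += 1
        if pair in detected:
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


# Hypothetical annotation and results, for illustration only.
subcomplex = {"P1": "base", "P2": "base", "P3": "lid", "P4": "core"}
tested = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")]
detected = {("P1", "P2"), ("P1", "P3")}
fractions = detection_fraction(tested, detected, subcomplex)
```

Running the same tabulation once per linker combination would give the kind of comparison summarized in Fig. 1E.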

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis on subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distances, where positive z-scores accumulate to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are better detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.
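The relationship between pairwise distance and z-score is summarized in this chapter with a Spearman rank correlation. A minimal standard-library sketch of that statistic is shown below; the distance and z-score values are hypothetical, and the p-value calculation is left out.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical C-terminus distances (Angstrom) and PPI z-scores.
distances = [20.0, 45.0, 60.0, 90.0, 120.0]
zscores = [9.1, 7.4, 7.9, 3.2, 1.0]
rho = spearman(distances, zscores)
```

A negative value of `rho`, as in this toy example, means that the signal tends to decrease as the two C-termini get farther apart.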

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys while requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to detect low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, counting reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
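Panel (B) reports Spearman rank correlations between PPI z-scores and C-terminal distances. As a reminder of what that statistic measures, here is a small self-contained sketch in plain Python. The data are made up for illustration and are not values from this study, and the helper names (`average_ranks`, `pearson`, `spearman_r`) are hypothetical. It computes only the coefficient, not the p-value.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up example: signal tends to drop as distance grows.
distances = [10, 20, 30, 40, 50]
z = [9, 7, 8, 3, 1]
rho = spearman_r(distances, z)
print(round(rho, 3))
```

A negative coefficient, as in the caption, means that pairs with more distant C-termini tend to give weaker PCA signal.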


Table S1A. Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference      Yeast proteins  Complex     Identity (%)
4C2M chain 1   Rpc10           RNApol I    100
4C2M chain 2   Rpa34           RNApol I    92.4
4C2M chain 3   Rpa49           RNApol I    94.4
4C2M chain 4   Rpa43           RNApol I    100
4C2M chain 5   Rpa190          RNApol I    89.7
4C2M chain 6   Rpc40           RNApol I    100
4C2M chain 7   Rpa135          RNApol I    97.2
4C2M chain 8   Rpb5            RNApol I    100
4C2M chain 9   Rpa14           RNApol I    59.6
4C2M chain 10  Rpa43           RNApol I    81.4
4C2M chain 11  Rpo26           RNApol I    100
4C2M chain 12  Rpa12           RNApol I    100
4C2M chain 13  Rpb8            RNApol I    88.2
4C2M chain 14  Rpc19           RNApol I    100
4C2M chain 15  Rpb10           RNApol I    100
4C2M chain 16  Rpa49           RNApol I    100
4C2M chain 17  Rpc10           RNApol I    100
4C2M chain 18  Rpa43           RNApol I    100
4C2M chain 19  Rpa34           RNApol I    92.4
4C2M chain 20  Rpa135          RNApol I    96.2
4C2M chain 21  Rpa190          RNApol I    88.5
4C2M chain 22  Rpa14           RNApol I    55.1
4C2M chain 23  Rpc40           RNApol I    100
4C2M chain 24  Rpo26           RNApol I    100
4C2M chain 25  Rpb5            RNApol I    100
4C2M chain 26  Rpb8            RNApol I    88.2
4C2M chain 27  Rpa43           RNApol I    80.2
4C2M chain 28  Rpb10           RNApol I    100
4C2M chain 29  Rpa12           RNApol I    96
4C2M chain 30  Rpc19           RNApol I    100
4C3I chain A   Rpa190          RNApol I    89.2
4C3I chain C   Rpc40           RNApol I    99.3
4C3I chain B   Rpa135          RNApol I    98.2
4C3I chain E   Rpb5            RNApol I    100
4C3I chain D   Rpa14           RNApol I    55.1
4C3I chain G   Rpa43           RNApol I    78.3
4C3I chain F   Rpo26           RNApol I    100
4C3I chain I   Rpa12           RNApol I    100
4C3I chain H   Rpb8            RNApol I    84.7
4C3I chain K   Rpc19           RNApol I    100
4C3I chain J   Rpb10           RNApol I    100
4C3I chain M   Rpa49           RNApol I    97.2
4C3I chain L   Rpc10           RNApol I    100
4C3I chain N   Rpa34           RNApol I    88
4V1N chain A   Rpo21           RNApol II   97.9
4V1N chain C   Rpb3            RNApol II   100
4V1N chain B   Rpb2            RNApol II   93.6
4V1N chain E   Rpb5            RNApol II   100
4V1N chain D   Rpb4            RNApol II   80.8
4V1N chain G   Rpb7            RNApol II   100
4V1N chain F   Rpo26           RNApol II   100
4V1N chain I   Rpb9            RNApol II   100
4V1N chain H   Rpb8            RNApol II   91
4V1N chain K   Rpb11           RNApol II   100
4V1N chain J   Rpb10           RNApol II   100
4V1N chain L   Rpc10           RNApol II   100
4V1N chain R   Tfg2            RNApol II   60.3
5FJA chain A   Rpo31           RNApol III  96.2
5FJA chain C   Rpc40           RNApol III  100
5FJA chain B   Ret1            RNApol III  100
5FJA chain E   Rpb5            RNApol III  100
5FJA chain D   Rpc17           RNApol III  73.9
5FJA chain G   Rpc25           RNApol III  85.8
5FJA chain F   Rpo26           RNApol III  100
5FJA chain I   Rpc11           RNApol III  82.7
5FJA chain H   Rpb8            RNApol III  94.5
5FJA chain K   Rpc19           RNApol III  100
5FJA chain J   Rpb10           RNApol III  100
5FJA chain M   Rpc37           RNApol III  84.9
5FJA chain L   Rpc10           RNApol III  100
5FJA chain O   Rpc82           RNApol III  84.3
5FJA chain N   Rpc53           RNApol III  73.8
5FJA chain Q   Rpc31           RNApol III  100
5FJA chain P   Rpc34           RNApol III  57.2
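The identity column compares each chain of a reference structure with the corresponding experimental yeast sequence. The exact alignment procedure and denominator used for this table are not stated here, so the sketch below only illustrates one common convention: percent identity over the aligned, non-gap columns of a pairwise alignment that is already available. The function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences.

    Both strings must have the same length and use '-' for gaps.
    Columns containing a gap in either sequence are ignored, which is one
    of several conventions in use (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy example: 6 compared columns, 5 identical residues.
identity = percent_identity("MKT-AYL", "MKSQAYL")
print(round(identity, 1))
```

With a different denominator (for example, the full alignment length), the same alignment would give a lower value, which is why the convention should be reported alongside the percentages.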


Table S2C. Identity between the proteasome structure and the experimental sequences

Reference                        Yeast proteins  Complex     Identity (%)
5CZ4-centered chain A            Pre8            Proteasome  100
5CZ4-centered chain AA           Pre4            Proteasome  100
5CZ4-centered chain B            Pre9            Proteasome  100
5CZ4-centered chain BA           Pre3            Proteasome  100
5CZ4-centered chain C            Pre6            Proteasome  100
5CZ4-centered chain D            Pup2            Proteasome  97.1
5CZ4-centered chain E            Pre5            Proteasome  100
5CZ4-centered chain F            Pre10           Proteasome  100
5CZ4-centered chain G            Scl1            Proteasome  100
5CZ4-centered chain H            Pup1            Proteasome  100
5CZ4-centered chain I            Pup3            Proteasome  100
5CZ4-centered chain J            Pre1            Proteasome  100
5CZ4-centered chain K            Pre2            Proteasome  100
5CZ4-centered chain L            Pre7            Proteasome  100
5CZ4-centered chain M            Pre4            Proteasome  100
5CZ4-centered chain N            Pre3            Proteasome  100
5CZ4-centered chain O            Pre8            Proteasome  100
5CZ4-centered chain P            Pre9            Proteasome  100
5CZ4-centered chain Q            Pre6            Proteasome  100
5CZ4-centered chain R            Pup2            Proteasome  97.1
5CZ4-centered chain S            Pre5            Proteasome  100
5CZ4-centered chain T            Pre10           Proteasome  100
5CZ4-centered chain U            Scl1            Proteasome  100
5CZ4-centered chain V            Pup1            Proteasome  100
5CZ4-centered chain W            Pup3            Proteasome  100
5CZ4-centered chain X            Pre1            Proteasome  100
5CZ4-centered chain Y            Pre2            Proteasome  100
5CZ4-centered chain Z            Pre7            Proteasome  100
5A5B-centered chain A            Pre3            Proteasome  100
5A5B-centered chain AA           Rpn7            Proteasome  100
5A5B-centered chain B            Pup1            Proteasome  100
5A5B-centered chain BA           Rpn3            Proteasome  100
5A5B-centered chain C            Pup3            Proteasome  100
5A5B-centered chain CA           Rpn12           Proteasome  100
5A5B-centered chain D            Pre1            Proteasome  100
5A5B-centered chain DA           Rpn8            Proteasome  82.9
5A5B-centered chain E            Pre2            Proteasome  99.5
5A5B-centered chain EA           Rpn11           Proteasome  89.5
5A5B-centered chain F            Pre7            Proteasome  100
5A5B-centered chain FA           Rpn10           Proteasome  100
5A5B-centered chain G            Pre4            Proteasome  100
5A5B-centered chain GA           Rpn13           Proteasome  100
5A5B-centered chain HA           Sem1            Proteasome  100
5A5B-centered chain IA           Rpn1            Proteasome  85.9
5A5B-centered chain J            Scl1            Proteasome  100
5A5B-centered chain K            Pre8            Proteasome  100
5A5B-centered chain L            Pre9            Proteasome  100
5A5B-centered chain M            Pre6            Proteasome  100
5A5B-centered chain N            Pup2            Proteasome  100
5A5B-centered chain O            Pre5            Proteasome  100
5A5B-centered chain P            Pre10           Proteasome  100
5A5B-centered chain Q            Rpt1            Proteasome  88
5A5B-centered chain R            Rpt2            Proteasome  100
5A5B-centered chain S            Rpt6            Proteasome  100
5A5B-centered chain T            Rpt3            Proteasome  100
5A5B-centered chain U            Rpt4            Proteasome  100
5A5B-centered chain V            Rpt5            Proteasome  93.1
5A5B-centered chain W            Rpn2            Proteasome  90.9
5A5B-centered chain X            Rpn9            Proteasome  100
5A5B-centered chain Y            Rpn5            Proteasome  100
5A5B-centered chain Z            Rpn6            Proteasome  100
Constructed proteasome chain 1   Pup1            Proteasome  100
Constructed proteasome chain 10  Pre8            Proteasome  100
Constructed proteasome chain 11  Pre9            Proteasome  100
Constructed proteasome chain 12  Pre6            Proteasome  100
Constructed proteasome chain 13  Pup2            Proteasome  100
Constructed proteasome chain 14  Pre5            Proteasome  100
Constructed proteasome chain 15  Pre10           Proteasome  100
Constructed proteasome chain 16  Rpt1            Proteasome  88
Constructed proteasome chain 17  Rpt2            Proteasome  100
Constructed proteasome chain 18  Rpt6            Proteasome  100
Constructed proteasome chain 19  Rpt3            Proteasome  100
Constructed proteasome chain 2   Pup3            Proteasome  100
Constructed proteasome chain 20  Rpt4            Proteasome  100
Constructed proteasome chain 21  Rpt5            Proteasome  93.1
Constructed proteasome chain 22  Rpn2            Proteasome  90.9
Constructed proteasome chain 23  Rpn9            Proteasome  100
Constructed proteasome chain 24  Rpn5            Proteasome  100
Constructed proteasome chain 25  Rpn6            Proteasome  100
Constructed proteasome chain 26  Rpn7            Proteasome  100
Constructed proteasome chain 27  Rpn3            Proteasome  100
Constructed proteasome chain 28  Rpn12           Proteasome  100
Constructed proteasome chain 29  Rpn8            Proteasome  82.9
Constructed proteasome chain 3   Pre1            Proteasome  100
Constructed proteasome chain 30  Rpn11           Proteasome  89.5
Constructed proteasome chain 31  Rpn10           Proteasome  100
Constructed proteasome chain 32  Rpn13           Proteasome  100
Constructed proteasome chain 33  Sem1            Proteasome  100
Constructed proteasome chain 34  Rpn1            Proteasome  85.9
Constructed proteasome chain 35  Pup1            Proteasome  100
Constructed proteasome chain 36  Pup3            Proteasome  100
Constructed proteasome chain 37  Pre1            Proteasome  100
Constructed proteasome chain 38  Pre2            Proteasome  100
Constructed proteasome chain 39  Pre7            Proteasome  100
Constructed proteasome chain 4   Pre2            Proteasome  100
Constructed proteasome chain 40  Pre4            Proteasome  100
Constructed proteasome chain 41  Pre3            Proteasome  100
Constructed proteasome chain 42  Pre4            Proteasome  100
Constructed proteasome chain 45  Scl1            Proteasome  100
Constructed proteasome chain 46  Pre8            Proteasome  100
Constructed proteasome chain 47  Pre9            Proteasome  100
Constructed proteasome chain 48  Pre6            Proteasome  100
Constructed proteasome chain 49  Pup2            Proteasome  100
Constructed proteasome chain 5   Pre7            Proteasome  100
Constructed proteasome chain 50  Pre5            Proteasome  100
Constructed proteasome chain 51  Pre10           Proteasome  100
Constructed proteasome chain 52  Rpt1            Proteasome  88
Constructed proteasome chain 53  Rpt2            Proteasome  100
Constructed proteasome chain 54  Rpt6            Proteasome  100
Constructed proteasome chain 55  Rpt3            Proteasome  100
Constructed proteasome chain 56  Rpt4            Proteasome  100
Constructed proteasome chain 57  Rpt5            Proteasome  93.1
Constructed proteasome chain 58  Rpn2            Proteasome  90.9
Constructed proteasome chain 59  Rpn9            Proteasome  100
Constructed proteasome chain 6   Pre3            Proteasome  100
Constructed proteasome chain 60  Rpn5            Proteasome  100
Constructed proteasome chain 61  Rpn6            Proteasome  100
Constructed proteasome chain 62  Rpn7            Proteasome  100
Constructed proteasome chain 63  Rpn3            Proteasome  100
Constructed proteasome chain 64  Rpn12           Proteasome  100
Constructed proteasome chain 65  Rpn8            Proteasome  82.9
Constructed proteasome chain 66  Rpn11           Proteasome  89.5
Constructed proteasome chain 67  Rpn10           Proteasome  100
Constructed proteasome chain 68  Rpn13           Proteasome  100
Constructed proteasome chain 69  Sem1            Proteasome  100
Constructed proteasome chain 70  Rpn1            Proteasome  85.9
Constructed proteasome chain 9   Scl1            Proteasome  100


Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

Yeast proteins  Complex     Reference       No. of missing residues in C-ter
Rpa190          RNApol I    4C2M monomer 1  0
Rpa14           RNApol I    4C2M monomer 1  37
Rpa12           RNApol I    4C2M monomer 1  0
Rpb5            RNApol I    4C2M monomer 1  0
Rpb10           RNApol I    4C2M monomer 1  1
Rpa49           RNApol I    4C2M monomer 1  300
Rpc19           RNApol I    4C2M monomer 1  0
Rpb8            RNApol I    4C2M monomer 1  0
Rpa34           RNApol I    4C2M monomer 1  52
Rpa43           RNApol I    4C2M monomer 1  10
Rpc40           RNApol I    4C2M monomer 1  0
Rpc10           RNApol I    4C2M monomer 1  0
Rpa135          RNApol I    4C2M monomer 1  0
Rpo26           RNApol I    4C2M monomer 1  1
Rpa190          RNApol I    4C2M monomer 2  0
Rpa14           RNApol I    4C2M monomer 2  37
Rpa12           RNApol I    4C2M monomer 2  0
Rpb5            RNApol I    4C2M monomer 2  0
Rpb10           RNApol I    4C2M monomer 2  1
Rpa49           RNApol I    4C2M monomer 2  300
Rpc19           RNApol I    4C2M monomer 2  0
Rpb8            RNApol I    4C2M monomer 2  0
Rpa34           RNApol I    4C2M monomer 2  53
Rpa43           RNApol I    4C2M monomer 2  76
Rpc40           RNApol I    4C2M monomer 2  0
Rpc10           RNApol I    4C2M monomer 2  0
Rpa135          RNApol I    4C2M monomer 2  0
Rpo26           RNApol I    4C2M monomer 2  1
Rpa190          RNApol I    4C3I            1
Rpa14           RNApol I    4C3I            37
Rpb5            RNApol I    4C3I            0
Rpb10           RNApol I    4C3I            1
Rpa49           RNApol I    4C3I            301
Rpc19           RNApol I    4C3I            0
Rpb8            RNApol I    4C3I            0
Rpa34           RNApol I    4C3I            53
Rpa12           RNApol I    4C3I            0
Rpa43           RNApol I    4C3I            10
Rpc40           RNApol I    4C3I            0
Rpc10           RNApol I    4C3I            0
Rpa135          RNApol I    4C3I            0
Rpo26           RNApol I    4C3I            1
Rpb3            RNApol II   4V1N            50
Rpb11           RNApol II   4V1N            6
Rpb5            RNApol II   4V1N            0
Rpb7            RNApol II   4V1N            0
Rpb10           RNApol II   4V1N            5
Rpo26           RNApol II   4V1N            0
Rpb8            RNApol II   4V1N            0
Rpb4            RNApol II   4V1N            0
Rpb9            RNApol II   4V1N            2
Tfg2            RNApol II   4V1N            173
Rpb2            RNApol II   4V1N            0
Rpc10           RNApol II   4V1N            0
Rpo21           RNApol II   4V1N            278
Rpc11           RNApol III  5FJA            0
Rpc19           RNApol III  5FJA            0
Ret1            RNApol III  5FJA            0
Rpb5            RNApol III  5FJA            0
Rpb10           RNApol III  5FJA            3
Rpc37           RNApol III  5FJA            20
Rpc82           RNApol III  5FJA            0
Rpc31           RNApol III  5FJA            182
Rpb8            RNApol III  5FJA            0
Rpc53           RNApol III  5FJA            0
Rpc25           RNApol III  5FJA            0
Rpc34           RNApol III  5FJA            2
Rpo31           RNApol III  5FJA            0
Rpc40           RNApol III  5FJA            0
Rpc10           RNApol III  5FJA            0
Rpc17           RNApol III  5FJA            0
Rpo26           RNApol III  5FJA            2
Rpn6            Proteasome  5CZ4 and 5A5B   3
Rpn5            Proteasome  5CZ4 and 5A5B   3
Rpn3            Proteasome  5CZ4 and 5A5B   45
Rpn2            Proteasome  5CZ4 and 5A5B   20
Rpn1            Proteasome  5CZ4 and 5A5B   0
Rpn9            Proteasome  5CZ4 and 5A5B   6
Rpn8            Proteasome  5CZ4 and 5A5B   30
Pre10           Proteasome  5CZ4 and 5A5B   39
Pre6            Proteasome  5CZ4 and 5A5B   10
Pre7            Proteasome  5CZ4 and 5A5B   0
Rpt3            Proteasome  5CZ4 and 5A5B   0
Rpt2            Proteasome  5CZ4 and 5A5B   1
Pre2            Proteasome  5CZ4 and 5A5B   0
Rpt4            Proteasome  5CZ4 and 5A5B   10
Pre1            Proteasome  5CZ4 and 5A5B   3
Pre8            Proteasome  5CZ4 and 5A5B   0
Pre9            Proteasome  5CZ4 and 5A5B   12
Pup2            Proteasome  5CZ4 and 5A5B   9
Pup3            Proteasome  5CZ4 and 5A5B   0
Pup1            Proteasome  5CZ4 and 5A5B   6
Rpn13           Proteasome  5CZ4 and 5A5B   23
Rpn12           Proteasome  5CZ4 and 5A5B   2
Rpn11           Proteasome  5CZ4 and 5A5B   8
Rpn10           Proteasome  5CZ4 and 5A5B   71
Sem1            Proteasome  5CZ4 and 5A5B   0
Scl1            Proteasome  5CZ4 and 5A5B   0
Rpt1            Proteasome  5CZ4 and 5A5B   11
Pre4            Proteasome  5CZ4 and 5A5B   4
Pre5            Proteasome  5CZ4 and 5A5B   0
Rpt5            Proteasome  5CZ4 and 5A5B   0
Pre3            Proteasome  5CZ4 and 5A5B   0
Rpt6            Proteasome  5CZ4 and 5A5B   9
Rpn7            Proteasome  5CZ4 and 5A5B   7


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results for the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in standard PCA (left).
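Panel (B) separates positive PPIs from background using a z-score cutoff on colony size. The sketch below shows the general idea under stated assumptions: the background is summarized by the mean and sample standard deviation of a set of background colony sizes, and the default cutoff of 2.5 is our reading of the threshold in the caption. How the background distribution was actually estimated in the experiments is not described in this section, and the names `z_scores` and `positive_ppis`, like the numbers, are hypothetical.

```python
from statistics import mean, stdev


def z_scores(colony_sizes, background):
    """Z-score of each colony size relative to a background sample."""
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation
    return {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}


def positive_ppis(colony_sizes, background, threshold=2.5):
    """Return the pairs whose z-score exceeds the threshold."""
    scores = z_scores(colony_sizes, background)
    return [pair for pair, z in scores.items() if z > threshold]


# Hypothetical colony sizes (arbitrary units).
background = [10, 12, 11, 9, 13]
sizes = {"A-B": 20, "C-D": 12}
positives = positive_ppis(sizes, background)
print(positives)
```

Expressing the signal as a z-score makes thresholds comparable across experiments whose raw colony sizes differ in scale.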


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
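Panels (C) and (D) describe a distance-weighted shortest path running over surface residues between two C-termini. One way to picture such a calculation is Dijkstra's algorithm on a graph whose nodes are surface points and whose edges connect points closer than some cutoff, weighted by Euclidean distance. The sketch below is an illustration of that idea only: the graph construction, the `cutoff` parameter, the function name and the toy coordinates are all assumptions, not the procedure used to produce the distances in this work.

```python
import heapq
import math


def shortest_surface_path(points, start, end, cutoff):
    """Length of the shortest path from points[start] to points[end].

    Two points are connected if their Euclidean distance is <= cutoff;
    each edge is weighted by that distance. Returns math.inf when the
    end point cannot be reached.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == end:
            return d
        for v in range(len(points)):
            if v == u or v in visited:
                continue
            w = math.dist(points[u], points[v])
            if w <= cutoff and d + w < best.get(v, math.inf):
                best[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return math.inf


# Toy example: four points on a 3 x 4 rectangle; the diagonal (length 5)
# exceeds the cutoff, so the path has to go around a corner.
points = [(0, 0, 0), (3, 0, 0), (3, 4, 0), (0, 4, 0)]
path_length = shortest_surface_path(points, start=0, end=2, cutoff=4.5)
print(path_length)
```

In this toy case the path length (7.0) is longer than the straight-line distance (5.0), which is the point of following the surface: a linker has to go around the complex rather than through it.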


General conclusion

The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid method is one that detects associations between proteins that are close to each other in space, without these associations necessarily being physical interactions. Such a method would therefore make it possible to dissect the architecture of protein complexes in greater depth. Concretely, the approach was to modify the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is maintained despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only a single combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect the majority of the new PPIs and of those whose signal increases.


The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when it involves central proteins. Finally, the detected associations accurately reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA represents a real advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the longer linkers, which would provide validation and a point of comparison, and would help detect problems that arose during construction of the fusion proteins. For example, it is easier to spot a faulty recombination event or the appearance of mutations: the correctly constructed protein would show interactions, whereas the problematic one would show none. Admittedly, adding this control makes the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously with each other unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under multiple growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would give several resolutions of the PPI network. First, one would need to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. One would also need to determine the optimal increment that maximizes new information while taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas a lower resolution would be appropriate for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43 Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69 44 Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6 45 Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32 46 Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54 47 Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36) 48 Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82 49 Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88 50 Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848 51 Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6 52 Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4 53 Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63 54 Michnick SW Exploring protein interactions by interaction-induced folding of proteins from 
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14


64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9


84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709


the endogenous proteins (Fig. S1A), suggesting that if linker length affects protein stability, it has a minor effect that is not generalized.

To verify the effect of longer linkers on the detection of PPIs by DHFR PCA (55), we constructed reporter strains for 15 proteins that are part of seven complexes, each fused to the DHFR F[1,2] fragment with the 2xL, 3xL and 4xL. Using high-density yeast colony arrays (57), we queried these baits (n = 45) against 592 prey proteins fused to DHFR F[3] (with the regular 2xL). These preys include proteins known to interact with the baits, proteins that are within the same complexes as the baits, and random proteins used as controls, for a total of 26,640 potential interactions in four replicates (Table S1B). We detected 99, 110 and 126 PPIs (z-score greater than 2.5) with the 2xL, 3xL and 4xL, respectively (Fig. S1B, top left panel), revealing a significant increase in signal-to-noise ratio with longer linkers, particularly for the 4xL. Four and seven PPIs showed greater than two-fold z-score differences with the 3xL (two decreases, two increases) and the 4xL (seven increases), respectively, as compared to the 2xL assay (Fig. 1A). Decreased interactions may represent steric effects that reduce signal due to the fusion of the DHFR fragments. Four out of nine increased interactions were reported by affinity-capture mass spectrometry (18) but not by PCA with standard linkers, suggesting that longer linkers may allow for the detection of PPIs that are not necessarily direct. Moreover, the four interactions with the highest PCA signal occur between baits and preys within the same complexes, suggesting that there is no decrease in specificity with the elongated linkers. Finally, for the cases where proteins were not in the same complex or were not previously shown to interact, it is likely that they represent actual interactions previously undetected in living cells. For example, many genetic interactions and physical interactions (in vitro and in vivo) have been described between the actin cytoskeleton and the proteasome (97, 98). Here, we detect some interactions in living cells (such as between Arc18 and Pup1), often with an increased signal with the 4xL compared to the 2xL (Table S1B). Together, these results show that the DHFR PCA with increased linker size reveals new interactions and could be an improved tool to study inter-complex associations.
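The size of this screen and the role of the z-score cut-off can be illustrated with a minimal Python sketch. The counts (15 bait proteins, three linkers, 592 preys, cut-off of 2.5) come from the text above. The colony sizes, the function name `call_ppis` and the scoring against the mean and standard deviation of a background set are illustrative assumptions, not the scoring procedure used in the study.

```python
from statistics import mean, stdev

# Screen dimensions taken from the text.
N_BAIT_PROTEINS = 15
LINKERS = ("2xL", "3xL", "4xL")
N_PREYS = 592
Z_CUTOFF = 2.5

n_baits = N_BAIT_PROTEINS * len(LINKERS)  # 45 bait strains
n_tests = n_baits * N_PREYS               # 26,640 bait-prey combinations


def call_ppis(colony_sizes, background, cutoff=Z_CUTOFF):
    """Return {pair: z} for the pairs whose z-score exceeds the cutoff.

    Hypothetical scoring: z = (size - mean(background)) / stdev(background).
    """
    mu, sigma = mean(background), stdev(background)
    z_scores = {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}
    return {pair: z for pair, z in z_scores.items() if z > cutoff}


# Made-up colony sizes (arbitrary units), for illustration only.
background = [100, 104, 96, 98, 102, 101, 99, 100]
colony_sizes = {("Arc18-4xL", "Pup1"): 140, ("Arc18-4xL", "RandomPrey"): 101}
hits = call_ppis(colony_sizes, background)
```

With these invented values, only the first pair passes the cut-off; the second stays within the background noise.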

PCA signal reflects the super-organization of protein complexes

To examine the effect of a longer linker on the detection of PPIs within complexes, we selected five complexes (RNApol I, II and III, proteasome and COG complexes), which differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10,192 unique tested PPIs, 755 interactions were considered as true PPIs (Fig. S1B and Table S1C), representing PPIs among 228 protein pairs after filtration (197 unique, with reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] accounting for only one PPI).
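The distinction between all detected pairs and unique pairs can be sketched as follows. This is only an illustration of the counting rule described above (reciprocal bait-prey orientations count once); the function name `unique_pairs` and the example pairs are hypothetical and do not come from Table S1C.

```python
def unique_pairs(detected):
    """Collapse ordered (bait, prey) pairs into unordered protein pairs."""
    return {frozenset(pair) for pair in detected}


# Made-up example: the first two entries are reciprocal orientations.
detected = [("Rpb2", "Rpb3"), ("Rpb3", "Rpb2"), ("Rpb5", "Rpo26")]
pairs = unique_pairs(detected)
```

Using an unordered set per pair means that X-Y and Y-X map to the same key, so three detected orientations yield two unique pairs here.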

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig. S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that no overall decrease in specificity is seen with the elongated linkers. However, the increased linker length had an obvious impact for 93 (83 unique) interacting pairs (Fig. 1B). PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.
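As a quick consistency check, the counts reported in this paragraph can be tallied in a few lines of Python. All numbers are taken from the text; only the variable names are introduced here.

```python
# Counts reported in the text, as (all pairs, unique pairs).
unchanged = (135, 114)
changed = (19, 18)
new = (74, 65)
total = (228, 197)

# Pairs affected by the longer linker: quantitatively changed plus new.
affected = tuple(c + n for c, n in zip(changed, new))
# Unchanged plus affected should give back the total number of pairs.
recomputed_total = tuple(u + a for u, a in zip(unchanged, affected))
# Fraction of pairs showing no significant change.
fraction_unchanged = tuple(u / t for u, t in zip(unchanged, total))
```

The affected pairs add up to 93 (83 unique), the categories sum to the 228 (197 unique) interacting pairs, and the unchanged fraction is just under 60% in both cases.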

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of the RNApol II, contributes to five out of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This consequence may thus arise from steric effects rather than from destabilization of the protein (Fig. 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol complexes compared to the proteasome and the COG complex (Fig. 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1, from the core subunit, and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, PCA signal using longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated with the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid) regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.
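The per-subcomplex fractions discussed here (detected PPIs out of all tested pairs for each combination of subcomplexes) can be computed along the following lines. This is a hedged sketch: the function name `detection_fraction`, the input layout and the detection flags are assumptions made for illustration, and only four proteasome proteins are included.

```python
from collections import defaultdict


def detection_fraction(tested_pairs, subcomplex_of):
    """Fraction of tested pairs detected, per unordered subcomplex combination.

    tested_pairs: iterable of (protein_a, protein_b, detected) tuples.
    subcomplex_of: dict mapping protein name -> subcomplex label.
    """
    tested = defaultdict(int)
    detected = defaultdict(int)
    for a, b, is_detected in tested_pairs:
        # Sort the labels so that base-lid and lid-base share one key.
        combo = tuple(sorted((subcomplex_of[a], subcomplex_of[b])))
        tested[combo] += 1
        detected[combo] += int(is_detected)
    return {combo: detected[combo] / tested[combo] for combo in tested}


# Illustrative subset of proteasome proteins; detection flags are made up.
subcomplex_of = {"Rpt1": "base", "Rpt2": "base", "Rpn5": "lid", "Rpn6": "lid"}
tested_pairs = [
    ("Rpt1", "Rpt2", True),
    ("Rpn5", "Rpn6", True),
    ("Rpt1", "Rpn5", False),
    ("Rpn6", "Rpt2", True),
]
fractions = detection_fraction(tested_pairs, subcomplex_of)
```

Running the same function once per linker combination would give the values to compare across linker lengths.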

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible from the cumulative distribution of z-scores as a function of pairwise distance, where positive z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are better detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.
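The negative relationship between pairwise distance and z-score is summarized in Figure 2 by a Spearman rank correlation. A minimal, dependency-free Python sketch of that statistic is given below. The distances and z-scores are made-up values, the function names are hypothetical, and the sketch does not reproduce the binning or the p-value calculation of the original analysis.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Made-up C-terminus distances (in angstroms) and PPI z-scores.
distances = [20, 45, 60, 85, 110, 150]
z_scores = [12.1, 9.4, 10.2, 4.0, 2.7, 0.5]
rho = spearman(distances, z_scores)
```

A negative `rho` indicates that z-scores tend to decrease as the distance between C-termini increases, which is the direction of the trend reported for the proteasome.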

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information among subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering reciprocal interactions such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3] as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between the z-scores of all detected PPIs in the proteasome and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
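The cumulative z-score curve of panel B (right) can be approximated with a short Python sketch. How the original curves were built is not detailed here, so the choice to accumulate only positive z-scores in order of increasing distance is an assumption based on the wording of the main text; the function name and the data are hypothetical.

```python
from itertools import accumulate


def cumulative_z_by_distance(pairs):
    """Return sorted distances and the running sum of positive z-scores.

    pairs: iterable of (distance, z_score) tuples, one per protein pair.
    """
    ordered = sorted(pairs)
    distances = [d for d, _ in ordered]
    # Negative z-scores contribute nothing to the running sum.
    running = list(accumulate(max(z, 0.0) for _, z in ordered))
    return distances, running


# Made-up (distance, z-score) values for one linker combination.
pairs = [(60, 3.0), (20, 5.0), (110, -1.0), (85, 2.0)]
distances, running = cumulative_z_by_distance(pairs)
```

A curve that keeps rising at larger distances means that positive signal is still being picked up between proteins that are far apart.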


Table S1A. Description of the strains constructed and used for this study
Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment
Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment
Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study
Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling
Table S2A is too lengthy to be included in this document but can be obtained upon request.


Table S2B. Identity between each RNApol structure and the experimental sequences

Reference    Yeast protein    Complex    Identity (%)
4C2M chain 1    Rpc10    RNApol I    100
4C2M chain 2    Rpa34    RNApol I    92.4
4C2M chain 3    Rpa49    RNApol I    94.4
4C2M chain 4    Rpa43    RNApol I    100
4C2M chain 5    Rpa190    RNApol I    89.7
4C2M chain 6    Rpc40    RNApol I    100
4C2M chain 7    Rpa135    RNApol I    97.2
4C2M chain 8    Rpb5    RNApol I    100
4C2M chain 9    Rpa14    RNApol I    59.6
4C2M chain 10    Rpa43    RNApol I    81.4
4C2M chain 11    Rpo26    RNApol I    100
4C2M chain 12    Rpa12    RNApol I    100
4C2M chain 13    Rpb8    RNApol I    88.2
4C2M chain 14    Rpc19    RNApol I    100
4C2M chain 15    Rpb10    RNApol I    100
4C2M chain 16    Rpa49    RNApol I    100
4C2M chain 17    Rpc10    RNApol I    100
4C2M chain 18    Rpa43    RNApol I    100
4C2M chain 19    Rpa34    RNApol I    92.4
4C2M chain 20    Rpa135    RNApol I    96.2
4C2M chain 21    Rpa190    RNApol I    88.5
4C2M chain 22    Rpa14    RNApol I    55.1
4C2M chain 23    Rpc40    RNApol I    100
4C2M chain 24    Rpo26    RNApol I    100
4C2M chain 25    Rpb5    RNApol I    100
4C2M chain 26    Rpb8    RNApol I    88.2
4C2M chain 27    Rpa43    RNApol I    80.2
4C2M chain 28    Rpb10    RNApol I    100
4C2M chain 29    Rpa12    RNApol I    96
4C2M chain 30    Rpc19    RNApol I    100
4C3I chain A    Rpa190    RNApol I    89.2
4C3I chain C    Rpc40    RNApol I    99.3
4C3I chain B    Rpa135    RNApol I    98.2
4C3I chain E    Rpb5    RNApol I    100
4C3I chain D    Rpa14    RNApol I    55.1
4C3I chain G    Rpa43    RNApol I    78.3
4C3I chain F    Rpo26    RNApol I    100
4C3I chain I    Rpa12    RNApol I    100
4C3I chain H    Rpb8    RNApol I    84.7
4C3I chain K    Rpc19    RNApol I    100
4C3I chain J    Rpb10    RNApol I    100
4C3I chain M    Rpa49    RNApol I    97.2
4C3I chain L    Rpc10    RNApol I    100
4C3I chain N    Rpa34    RNApol I    88
4V1N chain A    Rpo21    RNApol II    97.9
4V1N chain C    Rpb3    RNApol II    100
4V1N chain B    Rpb2    RNApol II    93.6
4V1N chain E    Rpb5    RNApol II    100
4V1N chain D    Rpb4    RNApol II    80.8
4V1N chain G    Rpb7    RNApol II    100
4V1N chain F    Rpo26    RNApol II    100
4V1N chain I    Rpb9    RNApol II    100
4V1N chain H    Rpb8    RNApol II    91
4V1N chain K    Rpb11    RNApol II    100
4V1N chain J    Rpb10    RNApol II    100
4V1N chain L    Rpc10    RNApol II    100
4V1N chain R    Tfg2    RNApol II    60.3
5FJA chain A    Rpo31    RNApol III    96.2
5FJA chain C    Rpc40    RNApol III    100
5FJA chain B    Ret1    RNApol III    100
5FJA chain E    Rpb5    RNApol III    100
5FJA chain D    Rpc17    RNApol III    73.9
5FJA chain G    Rpc25    RNApol III    85.8
5FJA chain F    Rpo26    RNApol III    100
5FJA chain I    Rpc11    RNApol III    82.7
5FJA chain H    Rpb8    RNApol III    94.5
5FJA chain K    Rpc19    RNApol III    100
5FJA chain J    Rpb10    RNApol III    100
5FJA chain M    Rpc37    RNApol III    84.9
5FJA chain L    Rpc10    RNApol III    100
5FJA chain O    Rpc82    RNApol III    84.3
5FJA chain N    Rpc53    RNApol III    73.8
5FJA chain Q    Rpc31    RNApol III    100
5FJA chain P    Rpc34    RNApol III    57.2


Table S2C. Identity between the proteasome structures and the experimental sequences

Reference    Yeast protein    Complex    Identity (%)
5CZ4-centered chain A    Pre8    Proteasome    100
5CZ4-centered chain AA    Pre4    Proteasome    100
5CZ4-centered chain B    Pre9    Proteasome    100
5CZ4-centered chain BA    Pre3    Proteasome    100
5CZ4-centered chain C    Pre6    Proteasome    100
5CZ4-centered chain D    Pup2    Proteasome    97.1
5CZ4-centered chain E    Pre5    Proteasome    100
5CZ4-centered chain F    Pre10    Proteasome    100
5CZ4-centered chain G    Scl1    Proteasome    100
5CZ4-centered chain H    Pup1    Proteasome    100
5CZ4-centered chain I    Pup3    Proteasome    100
5CZ4-centered chain J    Pre1    Proteasome    100
5CZ4-centered chain K    Pre2    Proteasome    100
5CZ4-centered chain L    Pre7    Proteasome    100
5CZ4-centered chain M    Pre4    Proteasome    100
5CZ4-centered chain N    Pre3    Proteasome    100
5CZ4-centered chain O    Pre8    Proteasome    100
5CZ4-centered chain P    Pre9    Proteasome    100
5CZ4-centered chain Q    Pre6    Proteasome    100
5CZ4-centered chain R    Pup2    Proteasome    97.1
5CZ4-centered chain S    Pre5    Proteasome    100
5CZ4-centered chain T    Pre10    Proteasome    100
5CZ4-centered chain U    Scl1    Proteasome    100
5CZ4-centered chain V    Pup1    Proteasome    100
5CZ4-centered chain W    Pup3    Proteasome    100
5CZ4-centered chain X    Pre1    Proteasome    100
5CZ4-centered chain Y    Pre2    Proteasome    100
5CZ4-centered chain Z    Pre7    Proteasome    100
5A5B-centered chain A    Pre3    Proteasome    100
5A5B-centered chain AA    Rpn7    Proteasome    100
5A5B-centered chain B    Pup1    Proteasome    100
5A5B-centered chain BA    Rpn3    Proteasome    100
5A5B-centered chain C    Pup3    Proteasome    100
5A5B-centered chain CA    Rpn12    Proteasome    100
5A5B-centered chain D    Pre1    Proteasome    100
5A5B-centered chain DA    Rpn8    Proteasome    82.9
5A5B-centered chain E    Pre2    Proteasome    99.5
5A5B-centered chain EA    Rpn11    Proteasome    89.5
5A5B-centered chain F    Pre7    Proteasome    100
5A5B-centered chain FA    Rpn10    Proteasome    100
5A5B-centered chain G    Pre4    Proteasome    100
5A5B-centered chain GA    Rpn13    Proteasome    100
5A5B-centered chain HA    Sem1    Proteasome    100
5A5B-centered chain IA    Rpn1    Proteasome    85.9
5A5B-centered chain J    Scl1    Proteasome    100
5A5B-centered chain K    Pre8    Proteasome    100
5A5B-centered chain L    Pre9    Proteasome    100
5A5B-centered chain M    Pre6    Proteasome    100
5A5B-centered chain N    Pup2    Proteasome    100
5A5B-centered chain O    Pre5    Proteasome    100
5A5B-centered chain P    Pre10    Proteasome    100
5A5B-centered chain Q    Rpt1    Proteasome    88
5A5B-centered chain R    Rpt2    Proteasome    100
5A5B-centered chain S    Rpt6    Proteasome    100
5A5B-centered chain T    Rpt3    Proteasome    100
5A5B-centered chain U    Rpt4    Proteasome    100
5A5B-centered chain V    Rpt5    Proteasome    93.1
5A5B-centered chain W    Rpn2    Proteasome    90.9
5A5B-centered chain X    Rpn9    Proteasome    100
5A5B-centered chain Y    Rpn5    Proteasome    100
5A5B-centered chain Z    Rpn6    Proteasome    100
Constructed proteasome chain 1    Pup1    Proteasome    100
Constructed proteasome chain 10    Pre8    Proteasome    100
Constructed proteasome chain 11    Pre9    Proteasome    100
Constructed proteasome chain 12    Pre6    Proteasome    100
Constructed proteasome chain 13    Pup2    Proteasome    100
Constructed proteasome chain 14    Pre5    Proteasome    100
Constructed proteasome chain 15    Pre10    Proteasome    100
Constructed proteasome chain 16    Rpt1    Proteasome    88
Constructed proteasome chain 17    Rpt2    Proteasome    100
Constructed proteasome chain 18    Rpt6    Proteasome    100
Constructed proteasome chain 19    Rpt3    Proteasome    100
Constructed proteasome chain 2    Pup3    Proteasome    100
Constructed proteasome chain 20    Rpt4    Proteasome    100
Constructed proteasome chain 21    Rpt5    Proteasome    93.1
Constructed proteasome chain 22    Rpn2    Proteasome    90.9
Constructed proteasome chain 23    Rpn9    Proteasome    100
Constructed proteasome chain 24    Rpn5    Proteasome    100
Constructed proteasome chain 25    Rpn6    Proteasome    100
Constructed proteasome chain 26    Rpn7    Proteasome    100
Constructed proteasome chain 27    Rpn3    Proteasome    100
Constructed proteasome chain 28    Rpn12    Proteasome    100
Constructed proteasome chain 29    Rpn8    Proteasome    82.9
Constructed proteasome chain 3    Pre1    Proteasome    100
Constructed proteasome chain 30    Rpn11    Proteasome    89.5
Constructed proteasome chain 31    Rpn10    Proteasome    100
Constructed proteasome chain 32    Rpn13    Proteasome    100
Constructed proteasome chain 33    Sem1    Proteasome    100
Constructed proteasome chain 34    Rpn1    Proteasome    85.9
Constructed proteasome chain 35    Pup1    Proteasome    100
Constructed proteasome chain 36    Pup3    Proteasome    100
Constructed proteasome chain 37    Pre1    Proteasome    100
Constructed proteasome chain 38    Pre2    Proteasome    100
Constructed proteasome chain 39    Pre7    Proteasome    100
Constructed proteasome chain 4    Pre2    Proteasome    100
Constructed proteasome chain 40    Pre4    Proteasome    100
Constructed proteasome chain 41    Pre3    Proteasome    100
Constructed proteasome chain 42    Pre4    Proteasome    100
Constructed proteasome chain 45    Scl1    Proteasome    100
Constructed proteasome chain 46    Pre8    Proteasome    100
Constructed proteasome chain 47    Pre9    Proteasome    100
Constructed proteasome chain 48    Pre6    Proteasome    100
Constructed proteasome chain 49    Pup2    Proteasome    100
Constructed proteasome chain 5    Pre7    Proteasome    100
Constructed proteasome chain 50    Pre5    Proteasome    100
Constructed proteasome chain 51    Pre10    Proteasome    100
Constructed proteasome chain 52    Rpt1    Proteasome    88
Constructed proteasome chain 53    Rpt2    Proteasome    100
Constructed proteasome chain 54    Rpt6    Proteasome    100
Constructed proteasome chain 55    Rpt3    Proteasome    100
Constructed proteasome chain 56    Rpt4    Proteasome    100
Constructed proteasome chain 57    Rpt5    Proteasome    93.1
Constructed proteasome chain 58    Rpn2    Proteasome    90.9
Constructed proteasome chain 59    Rpn9    Proteasome    100
Constructed proteasome chain 6    Pre3    Proteasome    100
Constructed proteasome chain 60    Rpn5    Proteasome    100
Constructed proteasome chain 61    Rpn6    Proteasome    100
Constructed proteasome chain 62    Rpn7    Proteasome    100
Constructed proteasome chain 63    Rpn3    Proteasome    100
Constructed proteasome chain 64    Rpn12    Proteasome    100
Constructed proteasome chain 65    Rpn8    Proteasome    82.9
Constructed proteasome chain 66    Rpn11    Proteasome    89.5
Constructed proteasome chain 67    Rpn10    Proteasome    100
Constructed proteasome chain 68    Rpn13    Proteasome    100
Constructed proteasome chain 69    Sem1    Proteasome    100
Constructed proteasome chain 70    Rpn1    Proteasome    85.9
Constructed proteasome chain 9    Scl1    Proteasome    100


Table S2D. Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

Yeast protein / Complex / Reference / Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signals (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions further in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results of the intra-complex experiment. Examples for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates with colony size in standard PCA (left).
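The thresholding described in panel (B) can be illustrated with a minimal sketch that converts colony sizes to z-scores and keeps the pairs above a cutoff. The function name, the choice of the mean and population standard deviation of all tested pairs as the background, and the toy values are assumptions made for illustration; the caption does not state how the z-scores were computed.

```python
from statistics import mean, pstdev


def call_positive_ppis(colony_sizes, z_cutoff):
    """Return {pair: z-score} for pairs whose z-score exceeds z_cutoff.

    colony_sizes maps a pair label to a colony size.
    Assumption: z-scores are taken against the mean and population
    standard deviation of all pairs tested in the same experiment.
    """
    values = list(colony_sizes.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return {}  # no spread, so nothing stands out from the background
    z_scores = {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}
    return {pair: z for pair, z in z_scores.items() if z > z_cutoff}


# Toy data (hypothetical): nine background colonies and one large colony.
toy_sizes = {f"background_{i}": 100.0 for i in range(9)}
toy_sizes["candidate"] = 1000.0
positives = call_positive_ppis(toy_sizes, z_cutoff=2.5)
print(positives)
```

The cutoff is passed explicitly so that the same function can be reused with whatever threshold a given experiment calls for.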


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
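One way to picture the distance-weighted shortest path of panels (C) and (D) is as a graph search over surface residues, where two residues are linked when they lie within some cutoff and each link is weighted by its Euclidean length. The sketch below uses Dijkstra's algorithm for this. The algorithm choice, the `max_edge` cutoff, the function name and the toy coordinates are all assumptions for illustration, since the caption does not specify how the path was computed.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, max_edge):
    """Length of the shortest path from start to end over surface residues.

    coords maps a residue label to (x, y, z) coordinates.
    Two residues are connected when they are at most max_edge apart
    (hypothetical cutoff); edge weight is their Euclidean distance.
    Returns math.inf when end cannot be reached.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        dist_so_far, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == end:
            return dist_so_far
        for other, xyz in coords.items():
            if other in visited:
                continue
            step = math.dist(coords[node], xyz)
            candidate = dist_so_far + step
            if step <= max_edge and candidate < best.get(other, math.inf):
                best[other] = candidate
                heapq.heappush(heap, (candidate, other))
    return math.inf


# Toy coordinates (hypothetical): two C-termini joined by two surface residues.
toy_coords = {
    "cterm_A": (0.0, 0.0, 0.0),
    "surface_1": (5.0, 0.0, 0.0),
    "surface_2": (5.0, 5.0, 0.0),
    "cterm_B": (10.0, 5.0, 0.0),
}
path_length = shortest_surface_path(toy_coords, "cterm_A", "cterm_B", max_edge=6.0)
print(path_length)
```

Because the path is forced along surface residues, its length is at least the straight-line distance between the two C-termini, which is the point of measuring distances around the complex instead of through it.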


General conclusion

The aim of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close to each other in space without necessarily being in physical interaction. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to check whether increasing the linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results with the distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and allowed better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, more than 30% of all unique interactions detected were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths that are used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect the majority of the new PPIs and of those whose signal increases.


The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers is informative about the organization of protein complexes, particularly when central proteins are involved. Finally, the detected associations reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to the DHFR PCA represents a clear advance in the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths. This would provide a validation and a point of comparison, and would help detect problems that may have occurred during the construction of the proteins. For example, it becomes easier to spot a faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close together, which reduces false positives. The DHFR PCA can be performed on solid or in liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would give several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also need to be determined, so as to maximize new information while taking into account the added complexity of each additional linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by extending the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


differ in protein sizes. We used four combinations of linker lengths (2xL-2xL, 2xL-4xL, 4xL-2xL, 4xL-4xL) for all proteins within a complex. As a negative control, tests for PPIs between the RNApol I, II and III and the COG complex were also performed. Among the 10192 unique tested PPIs, 755 interactions were considered true PPIs after filtering (Fig S1B and Table S1C), representing PPIs among 228 protein pairs (197 unique pairs when reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], are counted as a single PPI).
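
The counting of reciprocal interactions described above can be sketched in a few lines. This is a minimal illustration, assuming detected interactions are stored as (bait, prey) tuples; the function name `count_unique_pairs` and the placeholder protein names are hypothetical and are not taken from the study.

```python
def count_unique_pairs(detected):
    """Return (oriented, unordered) counts of detected bait-prey pairs.

    X-DHFR F[1,2] with Y-DHFR F[3] and its reciprocal are two oriented
    pairs but a single unordered pair.
    """
    oriented = set(detected)  # drop exact duplicates
    unordered = {frozenset(pair) for pair in oriented}
    return len(oriented), len(unordered)


# Placeholder data: (bait fused to DHFR F[1,2], prey fused to DHFR F[3])
detected = [("X", "Y"), ("Y", "X"), ("X", "Z"), ("W", "Z")]
print(count_unique_pairs(detected))  # (4, 3)
```

Using a `frozenset` for each pair makes the orientation irrelevant, which is the same convention as counting X-Y and Y-X as one PPI.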

As expected, no interaction was detected between the RNApol and COG proteins. Moreover, reciprocal PPI signals, i.e. X-DHFR F[1,2]-Y-DHFR F[3] versus Y-DHFR F[1,2]-X-DHFR F[3], were correlated, as previously noted (55) (Fig S1C, 4xL-4xL PPIs). Also, for almost 60% of interacting pairs (135/228, or 114/197 unique), no significant change in interaction strength was observed when using the 4xL compared to the 2xL, reinforcing the observation that the elongated linkers cause no overall decrease in specificity. However, the increased linker length had an obvious impact on 93 (83 unique) interacting pairs (Fig 1B). The PCA signal was quantitatively changed for 19 (18 unique) interacting pairs, and 74 (65 unique) new PPIs were detected using at least one 4xL. Thus, doubling the linker length can substantially widen the repertoire of detected interactions for a complex.
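
As a quick consistency check on the numbers above, the three categories of interacting pairs can be converted into percentages. The counts are those reported in the text (all orientations); the variable names are illustrative only.

```python
# Counts of interacting pairs reported in the text (all orientations).
counts = {"unchanged": 135, "quantitatively changed": 19, "new": 74}

total = sum(counts.values())  # 228 interacting pairs
percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```

The unchanged category comes out at 59.2% of the 228 pairs, which matches the "almost 60%" stated in the text, and the changed and new categories together account for the remaining 93 pairs.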

In general, having only one longer linker (mainly 4xL-DHFR F[1,2]) was sufficient to detect new interactions or to increase the PCA signal of a previously detected PPI (2xL-4xL compared with 2xL-2xL). However, the signal was often improved with the 4xL-4xL combination. In rare cases, increasing linker length had the opposite effect, leading to PPI loss or signal reduction. Rpo21 was particularly affected. This protein, one of the two largest components of RNApol II, contributes to five of the nine quantitatively decreased interactions. Rpo21-4xL keeps its interactions with its main partners (Rpb2 and Rpb3 (99)) but seems to lose all of the others. This outcome may thus arise from steric effects rather than from destabilization of the protein (Fig 1D).

Quantitative changes were observed for about 5-10% of the detected PPIs across complexes. However, a larger proportion (about 30-40%) of new interactions was detected for the RNApol complexes compared to the proteasome and the COG complex (Fig 1C). Within the RNApol complexes, more than half of the new interactions were found between proteins common to the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig 1D, center panel). In the COG complex, new interactions were seen between Cog1, from the core subunit, and proteins from lobe A or lobe B (Fig 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, the PCA signal obtained with longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated by the proteasome (Fig 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid), regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible in the cumulative distribution of z-scores as a function of pairwise distance, where positive z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other combinations (Fig 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig S1D). Finally, we find that the distance between proteins is significantly longer for cases where a longer linker increases the signal or leads to the detection of new interactions (Fig 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.
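
The rank correlation used to relate pairwise distance to interaction z-score can be illustrated as follows. This is a minimal sketch using only the Python standard library; the distance and z-score values are invented for illustration and are not data from the study, and the function names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented example: z-scores tend to fall as the C-terminus distance grows.
distances = [10, 20, 30, 40, 50]
z_scores = [9.0, 7.5, 8.0, 3.0, 1.0]
print(round(spearman(distances, z_scores), 2))  # -0.9
```

A negative coefficient, as in this toy example, corresponds to the trend described in the text: the larger the distance between C-termini, the lower the interaction z-score.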

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA, with a modest increase in linker size from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information on the subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institute of Health Research Grants 299432 and 324265 to CRL. CRL holds the Canadian Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1 Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right), compared to a 2xL. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, counting each pair of reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.


Figure 2 Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini: interactions that are not affected by longer linkers, those that increase in signal and those that are newly detected. p-values of Wilcoxon tests are shown.


Table S1A Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request

Table S1B PCA data for global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request

Table S1C PCA data for intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request

Table S1D PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request


Table S2A Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request


Table S2B Identity between each RNApol structure and the experimental sequences

Reference Yeast proteins Complex Identity (%)

4C2M chain 1 Rpc10 RNApol I 100

4C2M chain 2 Rpa34 RNApol I 92.4

4C2M chain 3 Rpa49 RNApol I 94.4

4C2M chain 4 Rpa43 RNApol I 100

4C2M chain 5 Rpa190 RNApol I 89.7

4C2M chain 6 Rpc40 RNApol I 100

4C2M chain 7 Rpa135 RNApol I 97.2

4C2M chain 8 Rpb5 RNApol I 100

4C2M chain 9 Rpa14 RNApol I 59.6

4C2M chain 10 Rpa43 RNApol I 81.4

4C2M chain 11 Rpo26 RNApol I 100

4C2M chain 12 Rpa12 RNApol I 100

4C2M chain 13 Rpb8 RNApol I 88.2

4C2M chain 14 Rpc19 RNApol I 100

4C2M chain 15 Rpb10 RNApol I 100

4C2M chain 16 Rpa49 RNApol I 100

4C2M chain 17 Rpc10 RNApol I 100

4C2M chain 18 Rpa43 RNApol I 100

4C2M chain 19 Rpa34 RNApol I 92.4

4C2M chain 20 Rpa135 RNApol I 96.2

4C2M chain 21 Rpa190 RNApol I 88.5

4C2M chain 22 Rpa14 RNApol I 55.1

4C2M chain 23 Rpc40 RNApol I 100

4C2M chain 24 Rpo26 RNApol I 100

4C2M chain 25 Rpb5 RNApol I 100

4C2M chain 26 Rpb8 RNApol I 88.2

4C2M chain 27 Rpa43 RNApol I 80.2

4C2M chain 28 Rpb10 RNApol I 100

4C2M chain 29 Rpa12 RNApol I 96

4C2M chain 30 Rpc19 RNApol I 100

4C3I chain A Rpa190 RNApol I 89.2

4C3I chain C Rpc40 RNApol I 99.3

4C3I chain B Rpa135 RNApol I 98.2

4C3I chain E Rpb5 RNApol I 100

4C3I chain D Rpa14 RNApol I 55.1

4C3I chain G Rpa43 RNApol I 78.3

4C3I chain F Rpo26 RNApol I 100

4C3I chain I Rpa12 RNApol I 100

4C3I chain H Rpb8 RNApol I 84.7

4C3I chain K Rpc19 RNApol I 100

4C3I chain J Rpb10 RNApol I 100

4C3I chain M Rpa49 RNApol I 97.2

4C3I chain L Rpc10 RNApol I 100

4C3I chain N Rpa34 RNApol I 88

4V1N chain A Rpo21 RNApol II 97.9

4V1N chain C Rpb3 RNApol II 100

4V1N chain B Rpb2 RNApol II 93.6

4V1N chain E Rpb5 RNApol II 100

4V1N chain D Rpb4 RNApol II 80.8

4V1N chain G Rpb7 RNApol II 100

4V1N chain F Rpo26 RNApol II 100

4V1N chain I Rpb9 RNApol II 100

4V1N chain H Rpb8 RNApol II 91

4V1N chain K Rpb11 RNApol II 100

4V1N chain J Rpb10 RNApol II 100

4V1N chain L Rpc10 RNApol II 100

4V1N chain R Tfg2 RNApol II 60.3

5FJA chain A Rpo31 RNApol III 96.2

5FJA chain C Rpc40 RNApol III 100

5FJA chain B Ret1 RNApol III 100

5FJA chain E Rpb5 RNApol III 100

5FJA chain D Rpc17 RNApol III 73.9

5FJA chain G Rpc25 RNApol III 85.8

5FJA chain F Rpo26 RNApol III 100

5FJA chain I Rpc11 RNApol III 82.7

5FJA chain H Rpb8 RNApol III 94.5

5FJA chain K Rpc19 RNApol III 100

5FJA chain J Rpb10 RNApol III 100

5FJA chain M Rpc37 RNApol III 84.9

5FJA chain L Rpc10 RNApol III 100

5FJA chain O Rpc82 RNApol III 84.3

5FJA chain N Rpc53 RNApol III 73.8

5FJA chain Q Rpc31 RNApol III 100

5FJA chain P Rpc34 RNApol III 57.2


Table S2C Identity between proteasome structure and the experimental sequence

Reference Yeast proteins Complex Identity (%)

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 97.1

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 97.1

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 82.9

5A5B-centered chain E Pre2 Proteasome 99.5

5A5B-centered chain EA Rpn11 Proteasome 89.5

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 85.9

5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 93.1

5A5B-centered chain W Rpn2 Proteasome 90.9

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 93.1

Constructed proteasome chain 22 Rpn2 Proteasome 90.9

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 82.9

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 89.5

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 85.9

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100


Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100


Table S2D Number of missing residues in C-termini of studied proteins in RNApol I, II and III and proteasome structures

Yeast proteins Complex Reference Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1 Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome: top right; RNApol I, II and III: bottom left; COG complex: bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions between proteins that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results for the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in standard PCA (left).


Figure S2 Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
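
The idea of a distance-weighted shortest path over surface residues can be sketched as a graph search. This is a minimal sketch under stated assumptions, not the procedure used to produce the figure: surface points are given as 3D coordinates, two points are connected when they lie within a hypothetical `cutoff` distance, and Dijkstra's algorithm returns the length of the shortest path along these connections. The function name, the cutoff and the toy coordinates are all illustrative.

```python
import heapq
import math


def shortest_surface_path(points, start, end, cutoff):
    """Length of the shortest path from points[start] to points[end].

    Points closer than `cutoff` are treated as connected, and each
    connection is weighted by its Euclidean length (Dijkstra's algorithm).
    Returns math.inf when the two points are not connected.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        dist, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == end:
            return dist
        for v in range(len(points)):
            if v in visited:
                continue
            step = math.dist(points[u], points[v])
            if step <= cutoff and dist + step < best.get(v, math.inf):
                best[v] = dist + step
                heapq.heappush(heap, (dist + step, v))
    return math.inf


# Toy "surface": five points forming an L shape, 1 unit apart.
points = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (2, 1, 0), (2, 2, 0)]
print(shortest_surface_path(points, 0, 4, cutoff=1.1))  # 4.0
print(round(math.dist(points[0], points[4]), 2))        # 2.83
```

In the toy example the path has to follow the points around the corner, so it is longer (4.0) than the straight-line distance between the two end points (about 2.83). This illustrates why a path constrained to the surface can be a different proxy for separation than a direct distance.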


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that can detect associations between proteins that are close to each other in space without these associations necessarily being physical interactions. Such a method would thus make it possible to explore and better dissect the architecture of protein complexes. Concretely, the approach consisted of modifying the length of the linkers of the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the detected interactions. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validation of the method could be completed by comparing the results obtained with the distances measured from the available protein structures of the proteasome.

The results of the first validation show that by adjusting a single parameter, namely by doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA makes it possible to detect new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary depending on the number of combinations of linkers of different lengths that are used. For example, this number could be lower if only one combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an elongated linker for the DHFR F[1,2] fragment seems to be sufficient to detect most of the new PPIs and those whose signal increases. The rare cases in which the signal decreased as the linker length increased would be caused by steric effects rather than by destabilization of the proteins involved. However, these cases can still provide structural information, notably by identifying the strongest associations within the complex. Furthermore, the use of elongated linkers provides information on the organization of protein complexes, particularly when it involves the central proteins. Finally, the detected associations closely reflect the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results obtained, it is possible to confirm that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA represents a significant advance in the study of protein-protein associations. By doubling only the length of the linker of the DHFR F[1,2] fragment, it is possible to increase the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the linkers of additional lengths, which would provide a validation and a point of comparison and would help detect problems that may have occurred during the construction of the proteins. For example, it is easier to spot a problem of incorrect recombination or of acquired mutations. Indeed, interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. However, adding this control certainly makes the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are in very close proximity, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the study of interaction dynamics (56). These features provide certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths in order to obtain several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The second would be to determine the optimal increment, one that maximizes the new information gained while taking into account the additional complexity that comes with each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they have been linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.



the three polymerases (Rpb5, Rpb10 and Rpo26) and proteins specific to each individual polymerase (Fig. 1D, left panel). In the proteasome, five new interactions involved Nas6, an assembly chaperone for the proteasome, and proteins from the base subunit (Fig. 1D, center panel). In the COG complex, new interactions were seen between Cog1, from the core subunit, and proteins from lobe A or lobe B (Fig. 1D, right panel). All these results show that doubling the linker length of central proteins in complexes expands the network of interactions detected by DHFR PCA and helps to better describe the organization of protein complexes in living cells.

In addition to uncovering new interactions, the PCA signal obtained with longer linkers allowed better discrimination between the different subunits of large complexes. This is particularly well illustrated by the proteasome (Fig. 1D and 1E, center panels). More PPIs are detected when the two proteins are in the same subcomplex (such as base-base, core-core and lid-lid), regardless of the linker length, though the fraction is systematically higher with longer linkers. The same trend is observed for the RNApol and COG complexes (Fig. 1D and 1E, left and right panels). Structural biology in living cells could thus gain from PPI data obtained with several linker lengths.
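The within- versus between-subcomplex comparison described above can be sketched in a few lines of Python. This is only an illustration, not the analysis pipeline used in the study: the function name `detection_fraction` and the detection calls in the toy data are hypothetical, and only the subcomplex membership of the example proteins (Rpt1 and Rpt2 in the base, Rpn3 and Rpn7 in the lid, Pre1 in the core) reflects the known proteasome organization.

```python
from collections import defaultdict

def detection_fraction(tested_pairs, detected_pairs, subcomplex_of):
    """Fraction of tested protein pairs that were detected, grouped by the
    (unordered) pair of subcomplexes the two proteins belong to."""
    detected_set = {frozenset(pair) for pair in detected_pairs}
    tested = defaultdict(int)
    detected = defaultdict(int)
    for a, b in tested_pairs:
        # Sort so that ("lid", "base") and ("base", "lid") share one key.
        key = tuple(sorted((subcomplex_of[a], subcomplex_of[b])))
        tested[key] += 1
        if frozenset((a, b)) in detected_set:
            detected[key] += 1
    return {key: detected[key] / tested[key] for key in tested}

# Toy data: subcomplex membership is real, detection calls are made up.
subcomplex_of = {"Rpt1": "base", "Rpt2": "base",
                 "Rpn3": "lid", "Rpn7": "lid", "Pre1": "core"}
tested_pairs = [("Rpt1", "Rpt2"), ("Rpn3", "Rpn7"), ("Rpt1", "Rpn3"),
                ("Rpt2", "Rpn7"), ("Pre1", "Rpt1")]
detected_pairs = [("Rpt1", "Rpt2"), ("Rpn3", "Rpn7"), ("Rpt1", "Rpn3")]

fractions = detection_fraction(tested_pairs, detected_pairs, subcomplex_of)
for combo, fraction in sorted(fractions.items()):
    print(combo, fraction)
```

Running the same function once per linker combination (for example 2xL-2xL and 4xL-4xL) would give the kind of per-subcomplex proportions that Fig. 1E summarizes.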

Longer linkers allow detection of more distant proteins in complexes

Because structural data for the RNApol and proteasome complexes were available, we tested whether the PCA signal with longer linkers reflects, at least partly, the proximity of proteins within complexes, as suggested by the analysis of subcomplexes. As a proxy for distance, we measured the shortest path between the C-termini of the proteins of interest (Table S2A). We find that interaction z-scores often reflect the distance between proteins (Fig. 2A). For the proteasome, the complex for which we have the most distance values, a negative correlation is observed between the pairwise distance and the interaction z-score of PPIs for all linker lengths (Fig. 2B, left panel). The stronger correlation for longer linkers is likely due to a better signal-to-noise ratio. The enhanced ability to detect interactions at longer distances with longer linkers is clearly visible in the cumulative distribution of z-scores as a function of pairwise distance, where positive z-scores accumulate up to a longer distance for the 4xL-4xL combination than for the other combinations (Fig. 2B, right panel). The density distribution of distances within complexes is also slightly shifted towards larger distances for longer linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer in cases where a longer linker increases the signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.
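The rank correlation between pairwise distance and interaction z-score reported above can be illustrated with a small, dependency-free sketch. The helper names (`average_ranks`, `pearson`, `spearman`) and the toy numbers are assumptions made for the example; the original analysis presumably relied on a statistics package, and this sketch does not compute a p-value.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up values: C-terminus distances (in angstroms) and PPI z-scores.
distances = [20, 45, 60, 90, 120]
zscores = [8.1, 6.0, 6.5, 2.2, 1.0]
print(spearman(distances, zscores))  # approximately -0.9 for this toy data
```

A negative coefficient, as in this toy example, corresponds to the trend reported for the proteasome: the farther apart two C-termini are, the lower the interaction z-score tends to be.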

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA with a modest increase in linker size, from 41 Å to 82 Å, can be used to detect interactions in these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available to obtain low-resolution structural information on the subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research (CIHR) grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by an NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.


Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful to infer the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) linker, compared to a 2xL linker. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, counting reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.
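Panel C counts reciprocal bait-prey orientations as a single PPI. A minimal sketch of that bookkeeping is given below. The function names are hypothetical, and the rule used when the two orientations disagree (keep the strongest call, in the order new > changed > unchanged) is an assumption made for illustration only; the text does not specify how such conflicts were resolved.

```python
from collections import Counter

# Assumed priority when the two orientations of a pair disagree
# (hypothetical rule, not stated in the source).
PRIORITY = {"unchanged": 0, "changed": 1, "new": 2}

def collapse_reciprocal(calls):
    """Merge (bait, prey, category) calls so that X-Y and Y-X form one PPI."""
    merged = {}
    for bait, prey, category in calls:
        pair = frozenset((bait, prey))  # orientation-independent key
        if pair not in merged or PRIORITY[category] > PRIORITY[merged[pair]]:
            merged[pair] = category
    return merged

def category_proportions(merged):
    """Proportion of unique PPIs falling in each category."""
    counts = Counter(merged.values())
    total = sum(counts.values())
    return {category: counts[category] / total for category in counts}

# Placeholder protein names and made-up calls.
calls = [("X", "Y", "new"), ("Y", "X", "unchanged"),
         ("X", "Z", "unchanged"),
         ("Z", "W", "changed"), ("W", "Z", "changed")]
merged = collapse_reciprocal(calls)
proportions = category_proportions(merged)
print(len(merged), proportions)
```

Using a `frozenset` as the dictionary key is what makes the two orientations of a pair collapse into one entry.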


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
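The two summaries used in panel B, binning into ten distance classes and accumulating z-scores along increasing distance, can be sketched as follows. Both functions are illustrative assumptions: the caption does not say whether the classes were equal-width or equal-count (equal-width is assumed here), nor exactly how the cumulative curve was computed (a running sum over pairs sorted by distance is assumed).

```python
def bin_by_distance(distances, zscores, n_bins=10):
    """Group z-scores into n_bins equal-width distance classes (assumed scheme)."""
    low, high = min(distances), max(distances)
    width = (high - low) / n_bins or 1.0  # avoid a zero width if all distances match
    bins = [[] for _ in range(n_bins)]
    for distance, z in zip(distances, zscores):
        # The largest distance falls on the upper edge; keep it in the last bin.
        index = min(int((distance - low) / width), n_bins - 1)
        bins[index].append(z)
    return bins

def cumulative_z(distances, zscores):
    """Running sum of z-scores for pairs sorted by increasing distance."""
    total = 0.0
    curve = []
    for distance, z in sorted(zip(distances, zscores)):
        total += z
        curve.append((distance, total))
    return curve

# Made-up distances (angstroms) and z-scores.
toy_distances = list(range(0, 110, 10))
toy_zscores = [1.0] * len(toy_distances)
toy_bins = bin_by_distance(toy_distances, toy_zscores)
print([len(b) for b in toy_bins])
print(cumulative_z([10, 30, 20], [2.0, -1.0, 3.0]))
```

With a curve of this kind, a linker combination that keeps gaining positive z-scores at larger distances would plateau later, which is the behavior described for the 4xL-4xL combination.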


Table S1A. Description of the strains constructed and used for this study.

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment.

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment.

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study.

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling.

Table S2A is too lengthy to be included in this document but can be obtained upon request.


Table S2B. Identity between each RNApol structure and the experimental sequences

Reference       Yeast protein  Complex     Identity (%)
4C2M chain 1    Rpc10          RNApol I    100
4C2M chain 2    Rpa34          RNApol I    92.4
4C2M chain 3    Rpa49          RNApol I    94.4
4C2M chain 4    Rpa43          RNApol I    100
4C2M chain 5    Rpa190         RNApol I    89.7
4C2M chain 6    Rpc40          RNApol I    100
4C2M chain 7    Rpa135         RNApol I    97.2
4C2M chain 8    Rpb5           RNApol I    100
4C2M chain 9    Rpa14          RNApol I    59.6
4C2M chain 10   Rpa43          RNApol I    81.4
4C2M chain 11   Rpo26          RNApol I    100
4C2M chain 12   Rpa12          RNApol I    100
4C2M chain 13   Rpb8           RNApol I    88.2
4C2M chain 14   Rpc19          RNApol I    100
4C2M chain 15   Rpb10          RNApol I    100
4C2M chain 16   Rpa49          RNApol I    100
4C2M chain 17   Rpc10          RNApol I    100
4C2M chain 18   Rpa43          RNApol I    100
4C2M chain 19   Rpa34          RNApol I    92.4
4C2M chain 20   Rpa135         RNApol I    96.2
4C2M chain 21   Rpa190         RNApol I    88.5
4C2M chain 22   Rpa14          RNApol I    55.1
4C2M chain 23   Rpc40          RNApol I    100
4C2M chain 24   Rpo26          RNApol I    100
4C2M chain 25   Rpb5           RNApol I    100
4C2M chain 26   Rpb8           RNApol I    88.2
4C2M chain 27   Rpa43          RNApol I    80.2
4C2M chain 28   Rpb10          RNApol I    100
4C2M chain 29   Rpa12          RNApol I    96
4C2M chain 30   Rpc19          RNApol I    100
4C3I chain A    Rpa190         RNApol I    89.2
4C3I chain C    Rpc40          RNApol I    99.3
4C3I chain B    Rpa135         RNApol I    98.2
4C3I chain E    Rpb5           RNApol I    100
4C3I chain D    Rpa14          RNApol I    55.1
4C3I chain G    Rpa43          RNApol I    78.3
4C3I chain F    Rpo26          RNApol I    100
4C3I chain I    Rpa12          RNApol I    100
4C3I chain H    Rpb8           RNApol I    84.7
4C3I chain K    Rpc19          RNApol I    100
4C3I chain J    Rpb10          RNApol I    100
4C3I chain M    Rpa49          RNApol I    97.2
4C3I chain L    Rpc10          RNApol I    100
4C3I chain N    Rpa34          RNApol I    88
4V1N chain A    Rpo21          RNApol II   97.9
4V1N chain C    Rpb3           RNApol II   100
4V1N chain B    Rpb2           RNApol II   93.6
4V1N chain E    Rpb5           RNApol II   100
4V1N chain D    Rpb4           RNApol II   80.8
4V1N chain G    Rpb7           RNApol II   100
4V1N chain F    Rpo26          RNApol II   100
4V1N chain I    Rpb9           RNApol II   100
4V1N chain H    Rpb8           RNApol II   91
4V1N chain K    Rpb11          RNApol II   100
4V1N chain J    Rpb10          RNApol II   100
4V1N chain L    Rpc10          RNApol II   100
4V1N chain R    Tfg2           RNApol II   60.3
5FJA chain A    Rpo31          RNApol III  96.2
5FJA chain C    Rpc40          RNApol III  100
5FJA chain B    Ret1           RNApol III  100
5FJA chain E    Rpb5           RNApol III  100
5FJA chain D    Rpc17          RNApol III  73.9
5FJA chain G    Rpc25          RNApol III  85.8
5FJA chain F    Rpo26          RNApol III  100
5FJA chain I    Rpc11          RNApol III  82.7
5FJA chain H    Rpb8           RNApol III  94.5
5FJA chain K    Rpc19          RNApol III  100
5FJA chain J    Rpb10          RNApol III  100
5FJA chain M    Rpc37          RNApol III  84.9
5FJA chain L    Rpc10          RNApol III  100
5FJA chain O    Rpc82          RNApol III  84.3
5FJA chain N    Rpc53          RNApol III  73.8
5FJA chain Q    Rpc31          RNApol III  100
5FJA chain P    Rpc34          RNApol III  57.2


Table S2C Identity between proteasome structure and the experimental sequence

Reference Yeast

proteins Complex

Identity

()

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 97.1

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 97.1

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 82.9

5A5B-centered chain E Pre2 Proteasome 99.5

5A5B-centered chain EA Rpn11 Proteasome 89.5

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 85.9


5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 93.1

5A5B-centered chain W Rpn2 Proteasome 90.9

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 93.1

Constructed proteasome chain 22 Rpn2 Proteasome 90.9

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 82.9

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 89.5

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 85.9

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100


Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100


Table S2D. Number of missing residues in the C-termini of studied proteins in RNApol I, II and III and proteasome structures

Yeast protein    Complex    Reference    Number of missing residues in C-terminus

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome, for all combinations of linker lengths, according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right, owing to the ability of the 4xL linker to detect interactions that are farther apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results of the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
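The thresholding described in panel (B) can be sketched as follows. This is a minimal illustration, assuming the z-score is computed against the mean and sample standard deviation of all colony sizes in one experiment; the function name `positive_ppis` and the input layout are hypothetical, and the normalization actually used for the figure may differ.

```python
from statistics import mean, stdev


def positive_ppis(colony_sizes, z_threshold=2.5):
    """Return {pair: z-score} for pairs whose z-score exceeds the threshold.

    colony_sizes: {(bait, prey): colony size}, a hypothetical input layout.
    The z-score definition (global mean, sample standard deviation) is an
    assumption made for this sketch.
    """
    values = list(colony_sizes.values())
    mu = mean(values)
    sigma = stdev(values)  # needs at least two measurements
    if sigma == 0:
        return {}  # all colonies identical: nothing stands out
    hits = {}
    for pair, size in colony_sizes.items():
        z = (size - mu) / sigma
        if z > z_threshold:
            hits[pair] = z
    return hits
```

With a sample standard deviation, a single colony can only exceed a z-score of 2.5 when the experiment contains enough background measurements, which large colony arrays easily provide.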


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
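The distance-weighted shortest path of panel (C) can be illustrated with a small Dijkstra search. This is a sketch under stated assumptions, not the procedure used to produce the figure: surface residues are treated as graph nodes, two residues are linked when they lie within a hypothetical `cutoff`, and each edge is weighted by the Euclidean distance between them. The function name, the cutoff value and the coordinate layout are all illustrative.

```python
import heapq
import math


def surface_path_length(coords, start, end, cutoff=10.0):
    """Length of the shortest path from start to end over nearby points.

    coords: {residue_id: (x, y, z)}, hypothetical surface residue positions.
    Two residues are connected when their distance is <= cutoff (assumed).
    Returns math.inf when no path exists.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == end:
            return d
        for other, xyz in coords.items():
            if other in done:
                continue
            step = math.dist(coords[node], xyz)
            if step <= cutoff and d + step < best.get(other, math.inf):
                best[other] = d + step
                heapq.heappush(heap, (d + step, other))
    return math.inf
```

Because the path is forced to hop between surface points, it is never shorter than the straight-line distance between the two termini, which is the intended property when the straight line would cross the interior of the complex.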


General conclusion

The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid method is one that detects associations between proteins that lie close together in space, whether or not they interact physically. Such a method would make it possible to dissect the architecture of protein complexes in greater depth. Concretely, the approach consisted of modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing linker length changed the set of interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, validation could be completed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thus improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variant of DHFR PCA detects new interactions while retaining the specificity of the method. Indeed, more than 30% of all unique interactions detected were new. One could therefore expect to obtain nearly as many new interactions if this PCA variant were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only a single combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases. The rare cases in which the signal decreased as linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers is informative about the organization of protein complexes, particularly when it involves the central proteins. Finally, the associations detected closely reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing linker length does allow the detection of associations between proteins that are farther apart in space.
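The comparison between structural distances and PCA results can be sketched as a per-combination summary. This is only an illustration: the function name, the input layout and the choice of the median as summary statistic are assumptions, and the values below are placeholders, not measurements from this work.

```python
from statistics import median


def median_distance_by_combo(detected, distances):
    """Median structural distance of detected pairs for each linker combination.

    detected:  {combination label: iterable of (protein_a, protein_b)}
    distances: {frozenset({protein_a, protein_b}): distance in angstroms}
    Pairs without a known distance are skipped; an empty group yields None.
    """
    summary = {}
    for combo, pairs in detected.items():
        values = [distances[frozenset(p)] for p in pairs if frozenset(p) in distances]
        summary[combo] = median(values) if values else None
    return summary


# Placeholder proteins and distances, for illustration only.
toy_distances = {
    frozenset({"A", "B"}): 20.0,
    frozenset({"A", "C"}): 45.0,
    frozenset({"B", "C"}): 80.0,
}
toy_detected = {
    "2xL-2xL": [("A", "B")],
    "4xL-4xL": [("A", "B"), ("A", "C"), ("C", "B")],
}
print(median_distance_by_combo(toy_detected, toy_distances))
```

A larger median for the longer-linker combination would be consistent with the conclusion stated above; with real data, the full distributions would be more informative than a single summary value.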

The modification made to DHFR PCA represents a real advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths. This would provide a validation and a point of comparison, and would help detect problems that may have arisen during construction of the fusion proteins. For example, faulty recombination or the appearance of mutations becomes easier to spot: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variant of DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are brought very close together, which reduces false positives. DHFR PCA can be performed on solid or in liquid medium. It is therefore easy to study PPIs under multiple growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the dynamics of interactions (56). These features offer certain advantages over other hybrid methods.

Only two linker lengths were tested in this project. It would be interesting to establish a range of linker lengths providing several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations, thereby limiting false positives. It would also be necessary to determine the optimal increment that maximizes new information while accounting for the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would give a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project is particularly attractive for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

50

84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709

linkers, showing that longer distances are more readily detected with longer linkers (Fig. S1D). Finally, we find that the distance between proteins is significantly longer in cases where a longer linker increases the signal or leads to the detection of new interactions (Fig. 2C). This demonstrates once again that a longer linker enhances the ability to detect interactions, especially for proteins that are more distant in space.

Conclusion

Understanding the molecular organization of the cell at the scale of protein complexes remains challenging, largely because it is difficult to study how proteins interact directly and indirectly in vivo (88). Progress requires that we adapt or develop tools to detect and measure protein proximity in living cells and among endogenously expressed proteins. Here we show that DHFR PCA, with a modest increase in linker size from 41 Å to 82 Å, can be used to detect interactions under these specific conditions with an increased signal-to-noise ratio and with an enhanced ability to detect distant PPIs, including interactions among complexes and among subcomplexes within large complexes. Because a single longer linker is generally sufficient to detect new interactions, the current strains from the DHFR PCA collection could be used as preys, requiring only the construction of baits with different linker sizes. PCA is therefore an addition to the other methods available for obtaining low-resolution structural information on the subunits of complexes, which include chemical cross-linking of protein complexes (100), FRET-based analyses (101) and BioID proximity-dependent biotinylation in mammalian cells (68). Despite major advances in these other technologies in recent years, PCA will remain the simplest assay because it requires minimal infrastructure investment and can be adapted for high-throughput screening, which is still difficult to achieve with other approaches.

Acknowledgements

Funding for this project comes from Canadian Institutes of Health Research (CIHR) grants 299432 and 324265 to CRL. CRL holds the Canada Research Chair in Evolutionary Cell and Systems Biology. AEC was supported by fellowships from CIHR and FRSQ. CL was supported by a NSERC NRSA Scholarship. The authors thank the members of the Landry laboratory for feedback on the manuscript and Marie Filteau for guidance on the statistical analyses.

Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation assay (PCA) screen and prove useful for inferring the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) linker, compared to a 2xL linker. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker combination but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, counting each pair of reciprocal interactions, such as X-DHFR F[1,2] - Y-DHFR F[3] and Y-DHFR F[1,2] - X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signals for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.
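
The counting rule in panel (C), where the two orientations of a bait-prey pair are treated as one PPI, can be illustrated with a minimal Python sketch. The function name `collapse_reciprocal`, the tuple layout and the z-score values are hypothetical and are not taken from the study's analysis scripts.

```python
from collections import defaultdict

def collapse_reciprocal(measurements):
    """Group bait-prey measurements so that X-Y and Y-X count as one PPI.

    `measurements` is an iterable of (bait, prey, z_score) tuples, where the
    bait carries DHFR F[1,2] and the prey carries DHFR F[3].
    Returns a dict mapping an unordered protein pair to its list of z-scores.
    """
    grouped = defaultdict(list)
    for bait, prey, z_score in measurements:
        pair = tuple(sorted((bait, prey)))  # orientation-independent key
        grouped[pair].append(z_score)
    return dict(grouped)

# Hypothetical values, for illustration only.
example = [
    ("Rpn5", "Scl1", 3.1),
    ("Scl1", "Rpn5", 2.8),
    ("Pre1", "Pre2", 4.0),
]
unique_ppis = collapse_reciprocal(example)
print(len(unique_ppis))  # 2 unique PPIs from 3 measurements
```

Keeping the list of z-scores per pair, rather than a single value, leaves open how the two orientations are later combined (for example, by taking the maximum or the mean).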

Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between the z-scores of all detected PPIs in the proteasome and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different pairwise protein distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini: interactions that are not affected by longer linkers, those that increase in signal and those that are newly detected. p-values of Wilcoxon tests are shown.
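
For readers unfamiliar with the statistic reported in panel (B), the following is a minimal, dependency-free Python sketch of a Spearman rank correlation between C-terminal distances and z-scores. The distances and z-scores are made-up values, the helper names are hypothetical, and the p-values quoted in the caption are not reproduced here.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_r(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical C-terminal distances (in Å) and PPI z-scores.
distances = [20.0, 35.0, 50.0, 80.0, 120.0]
z_scores = [6.0, 5.0, 5.5, 3.0, 2.0]
r = spearman_r(distances, z_scores)
print(round(r, 2))  # -0.9: signal tends to drop as distance grows
```

A negative coefficient, as in the caption, indicates that PCA signal tends to decrease as the distance between C-termini increases.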

Table S1A. Description of the strains constructed and used for this study.

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment.

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment.

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study.

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling.

Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences.

Reference       Yeast protein  Complex     Identity (%)
4C2M chain 1    Rpc10          RNApol I    100
4C2M chain 2    Rpa34          RNApol I    92.4
4C2M chain 3    Rpa49          RNApol I    94.4
4C2M chain 4    Rpa43          RNApol I    100
4C2M chain 5    Rpa190         RNApol I    89.7
4C2M chain 6    Rpc40          RNApol I    100
4C2M chain 7    Rpa135         RNApol I    97.2
4C2M chain 8    Rpb5           RNApol I    100
4C2M chain 9    Rpa14          RNApol I    59.6
4C2M chain 10   Rpa43          RNApol I    81.4
4C2M chain 11   Rpo26          RNApol I    100
4C2M chain 12   Rpa12          RNApol I    100
4C2M chain 13   Rpb8           RNApol I    88.2
4C2M chain 14   Rpc19          RNApol I    100
4C2M chain 15   Rpb10          RNApol I    100
4C2M chain 16   Rpa49          RNApol I    100
4C2M chain 17   Rpc10          RNApol I    100
4C2M chain 18   Rpa43          RNApol I    100
4C2M chain 19   Rpa34          RNApol I    92.4
4C2M chain 20   Rpa135         RNApol I    96.2
4C2M chain 21   Rpa190         RNApol I    88.5
4C2M chain 22   Rpa14          RNApol I    55.1
4C2M chain 23   Rpc40          RNApol I    100
4C2M chain 24   Rpo26          RNApol I    100
4C2M chain 25   Rpb5           RNApol I    100
4C2M chain 26   Rpb8           RNApol I    88.2
4C2M chain 27   Rpa43          RNApol I    80.2
4C2M chain 28   Rpb10          RNApol I    100
4C2M chain 29   Rpa12          RNApol I    96
4C2M chain 30   Rpc19          RNApol I    100
4C3I chain A    Rpa190         RNApol I    89.2
4C3I chain C    Rpc40          RNApol I    99.3
4C3I chain B    Rpa135         RNApol I    98.2
4C3I chain E    Rpb5           RNApol I    100
4C3I chain D    Rpa14          RNApol I    55.1
4C3I chain G    Rpa43          RNApol I    78.3
4C3I chain F    Rpo26          RNApol I    100
4C3I chain I    Rpa12          RNApol I    100
4C3I chain H    Rpb8           RNApol I    84.7
4C3I chain K    Rpc19          RNApol I    100
4C3I chain J    Rpb10          RNApol I    100
4C3I chain M    Rpa49          RNApol I    97.2
4C3I chain L    Rpc10          RNApol I    100
4C3I chain N    Rpa34          RNApol I    88
4V1N chain A    Rpo21          RNApol II   97.9
4V1N chain C    Rpb3           RNApol II   100
4V1N chain B    Rpb2           RNApol II   93.6
4V1N chain E    Rpb5           RNApol II   100
4V1N chain D    Rpb4           RNApol II   80.8
4V1N chain G    Rpb7           RNApol II   100
4V1N chain F    Rpo26          RNApol II   100
4V1N chain I    Rpb9           RNApol II   100
4V1N chain H    Rpb8           RNApol II   91
4V1N chain K    Rpb11          RNApol II   100
4V1N chain J    Rpb10          RNApol II   100
4V1N chain L    Rpc10          RNApol II   100
4V1N chain R    Tfg2           RNApol II   60.3
5FJA chain A    Rpo31          RNApol III  96.2
5FJA chain C    Rpc40          RNApol III  100
5FJA chain B    Ret1           RNApol III  100
5FJA chain E    Rpb5           RNApol III  100
5FJA chain D    Rpc17          RNApol III  73.9
5FJA chain G    Rpc25          RNApol III  85.8
5FJA chain F    Rpo26          RNApol III  100
5FJA chain I    Rpc11          RNApol III  82.7
5FJA chain H    Rpb8           RNApol III  94.5
5FJA chain K    Rpc19          RNApol III  100
5FJA chain J    Rpb10          RNApol III  100
5FJA chain M    Rpc37          RNApol III  84.9
5FJA chain L    Rpc10          RNApol III  100
5FJA chain O    Rpc82          RNApol III  84.3
5FJA chain N    Rpc53          RNApol III  73.8
5FJA chain Q    Rpc31          RNApol III  100
5FJA chain P    Rpc34          RNApol III  57.2
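
As a reading aid for the identity column, the sketch below shows one common way to compute percent identity from two pre-aligned sequences in Python. It assumes gapped positions are excluded from the denominator, which may differ from the convention used to produce the table; the function name and the sequence fragments are invented for illustration.

```python
def percent_identity(aligned_ref, aligned_query, gap="-"):
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for ref_res, query_res in zip(aligned_ref, aligned_query):
        if ref_res == gap or query_res == gap:
            continue  # skip positions missing from either sequence
        compared += 1
        if ref_res == query_res:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * identical / compared

# Made-up aligned fragments, for illustration only.
structure_seq = "MKT-LLVAG"
yeast_seq = "MKTALIVAG"
identity = percent_identity(structure_seq, yeast_seq)
print(round(identity, 1))  # 87.5 (7 identical residues out of 8 compared)
```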

Table S2C. Identity between the proteasome structures and the experimental sequences.

Reference                        Yeast protein  Complex     Identity (%)
5CZ4-centered chain A            Pre8           Proteasome  100
5CZ4-centered chain AA           Pre4           Proteasome  100
5CZ4-centered chain B            Pre9           Proteasome  100
5CZ4-centered chain BA           Pre3           Proteasome  100
5CZ4-centered chain C            Pre6           Proteasome  100
5CZ4-centered chain D            Pup2           Proteasome  97.1
5CZ4-centered chain E            Pre5           Proteasome  100
5CZ4-centered chain F            Pre10          Proteasome  100
5CZ4-centered chain G            Scl1           Proteasome  100
5CZ4-centered chain H            Pup1           Proteasome  100
5CZ4-centered chain I            Pup3           Proteasome  100
5CZ4-centered chain J            Pre1           Proteasome  100
5CZ4-centered chain K            Pre2           Proteasome  100
5CZ4-centered chain L            Pre7           Proteasome  100
5CZ4-centered chain M            Pre4           Proteasome  100
5CZ4-centered chain N            Pre3           Proteasome  100
5CZ4-centered chain O            Pre8           Proteasome  100
5CZ4-centered chain P            Pre9           Proteasome  100
5CZ4-centered chain Q            Pre6           Proteasome  100
5CZ4-centered chain R            Pup2           Proteasome  97.1
5CZ4-centered chain S            Pre5           Proteasome  100
5CZ4-centered chain T            Pre10          Proteasome  100
5CZ4-centered chain U            Scl1           Proteasome  100
5CZ4-centered chain V            Pup1           Proteasome  100
5CZ4-centered chain W            Pup3           Proteasome  100
5CZ4-centered chain X            Pre1           Proteasome  100
5CZ4-centered chain Y            Pre2           Proteasome  100
5CZ4-centered chain Z            Pre7           Proteasome  100
5A5B-centered chain A            Pre3           Proteasome  100
5A5B-centered chain AA           Rpn7           Proteasome  100
5A5B-centered chain B            Pup1           Proteasome  100
5A5B-centered chain BA           Rpn3           Proteasome  100
5A5B-centered chain C            Pup3           Proteasome  100
5A5B-centered chain CA           Rpn12          Proteasome  100
5A5B-centered chain D            Pre1           Proteasome  100
5A5B-centered chain DA           Rpn8           Proteasome  82.9
5A5B-centered chain E            Pre2           Proteasome  99.5
5A5B-centered chain EA           Rpn11          Proteasome  89.5
5A5B-centered chain F            Pre7           Proteasome  100
5A5B-centered chain FA           Rpn10          Proteasome  100
5A5B-centered chain G            Pre4           Proteasome  100
5A5B-centered chain GA           Rpn13          Proteasome  100
5A5B-centered chain HA           Sem1           Proteasome  100
5A5B-centered chain IA           Rpn1           Proteasome  85.9
5A5B-centered chain J            Scl1           Proteasome  100
5A5B-centered chain K            Pre8           Proteasome  100
5A5B-centered chain L            Pre9           Proteasome  100
5A5B-centered chain M            Pre6           Proteasome  100
5A5B-centered chain N            Pup2           Proteasome  100
5A5B-centered chain O            Pre5           Proteasome  100
5A5B-centered chain P            Pre10          Proteasome  100
5A5B-centered chain Q            Rpt1           Proteasome  88
5A5B-centered chain R            Rpt2           Proteasome  100
5A5B-centered chain S            Rpt6           Proteasome  100
5A5B-centered chain T            Rpt3           Proteasome  100
5A5B-centered chain U            Rpt4           Proteasome  100
5A5B-centered chain V            Rpt5           Proteasome  93.1
5A5B-centered chain W            Rpn2           Proteasome  90.9
5A5B-centered chain X            Rpn9           Proteasome  100
5A5B-centered chain Y            Rpn5           Proteasome  100
5A5B-centered chain Z            Rpn6           Proteasome  100
Constructed proteasome chain 1   Pup1           Proteasome  100
Constructed proteasome chain 10  Pre8           Proteasome  100
Constructed proteasome chain 11  Pre9           Proteasome  100
Constructed proteasome chain 12  Pre6           Proteasome  100
Constructed proteasome chain 13  Pup2           Proteasome  100
Constructed proteasome chain 14  Pre5           Proteasome  100
Constructed proteasome chain 15  Pre10          Proteasome  100
Constructed proteasome chain 16  Rpt1           Proteasome  88
Constructed proteasome chain 17  Rpt2           Proteasome  100
Constructed proteasome chain 18  Rpt6           Proteasome  100
Constructed proteasome chain 19  Rpt3           Proteasome  100
Constructed proteasome chain 2   Pup3           Proteasome  100
Constructed proteasome chain 20  Rpt4           Proteasome  100
Constructed proteasome chain 21  Rpt5           Proteasome  93.1
Constructed proteasome chain 22  Rpn2           Proteasome  90.9
Constructed proteasome chain 23  Rpn9           Proteasome  100
Constructed proteasome chain 24  Rpn5           Proteasome  100
Constructed proteasome chain 25  Rpn6           Proteasome  100
Constructed proteasome chain 26  Rpn7           Proteasome  100
Constructed proteasome chain 27  Rpn3           Proteasome  100
Constructed proteasome chain 28  Rpn12          Proteasome  100
Constructed proteasome chain 29  Rpn8           Proteasome  82.9
Constructed proteasome chain 3   Pre1           Proteasome  100
Constructed proteasome chain 30  Rpn11          Proteasome  89.5
Constructed proteasome chain 31  Rpn10          Proteasome  100
Constructed proteasome chain 32  Rpn13          Proteasome  100
Constructed proteasome chain 33  Sem1           Proteasome  100
Constructed proteasome chain 34  Rpn1           Proteasome  85.9
Constructed proteasome chain 35  Pup1           Proteasome  100
Constructed proteasome chain 36  Pup3           Proteasome  100
Constructed proteasome chain 37  Pre1           Proteasome  100
Constructed proteasome chain 38  Pre2           Proteasome  100
Constructed proteasome chain 39  Pre7           Proteasome  100
Constructed proteasome chain 4   Pre2           Proteasome  100
Constructed proteasome chain 40  Pre4           Proteasome  100
Constructed proteasome chain 41  Pre3           Proteasome  100
Constructed proteasome chain 42  Pre4           Proteasome  100
Constructed proteasome chain 45  Scl1           Proteasome  100
Constructed proteasome chain 46  Pre8           Proteasome  100
Constructed proteasome chain 47  Pre9           Proteasome  100
Constructed proteasome chain 48  Pre6           Proteasome  100
Constructed proteasome chain 49  Pup2           Proteasome  100
Constructed proteasome chain 5   Pre7           Proteasome  100
Constructed proteasome chain 50  Pre5           Proteasome  100
Constructed proteasome chain 51  Pre10          Proteasome  100
Constructed proteasome chain 52  Rpt1           Proteasome  88
Constructed proteasome chain 53  Rpt2           Proteasome  100
Constructed proteasome chain 54  Rpt6           Proteasome  100
Constructed proteasome chain 55  Rpt3           Proteasome  100
Constructed proteasome chain 56  Rpt4           Proteasome  100
Constructed proteasome chain 57  Rpt5           Proteasome  93.1
Constructed proteasome chain 58  Rpn2           Proteasome  90.9
Constructed proteasome chain 59  Rpn9           Proteasome  100
Constructed proteasome chain 6   Pre3           Proteasome  100
Constructed proteasome chain 60  Rpn5           Proteasome  100
Constructed proteasome chain 61  Rpn6           Proteasome  100
Constructed proteasome chain 62  Rpn7           Proteasome  100
Constructed proteasome chain 63  Rpn3           Proteasome  100
Constructed proteasome chain 64  Rpn12          Proteasome  100
Constructed proteasome chain 65  Rpn8           Proteasome  82.9
Constructed proteasome chain 66  Rpn11          Proteasome  89.5
Constructed proteasome chain 67  Rpn10          Proteasome  100
Constructed proteasome chain 68  Rpn13          Proteasome  100
Constructed proteasome chain 69  Sem1           Proteasome  100
Constructed proteasome chain 70  Rpn1           Proteasome  85.9
Constructed proteasome chain 9   Scl1           Proteasome  100

Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I, II and III and proteasome structures.

Yeast protein  Complex     Reference       Missing C-terminal residues
Rpa190         RNApol I    4C2M monomer 1  0
Rpa14          RNApol I    4C2M monomer 1  37
Rpa12          RNApol I    4C2M monomer 1  0
Rpb5           RNApol I    4C2M monomer 1  0
Rpb10          RNApol I    4C2M monomer 1  1
Rpa49          RNApol I    4C2M monomer 1  300
Rpc19          RNApol I    4C2M monomer 1  0
Rpb8           RNApol I    4C2M monomer 1  0
Rpa34          RNApol I    4C2M monomer 1  52
Rpa43          RNApol I    4C2M monomer 1  10
Rpc40          RNApol I    4C2M monomer 1  0
Rpc10          RNApol I    4C2M monomer 1  0
Rpa135         RNApol I    4C2M monomer 1  0
Rpo26          RNApol I    4C2M monomer 1  1
Rpa190         RNApol I    4C2M monomer 2  0
Rpa14          RNApol I    4C2M monomer 2  37
Rpa12          RNApol I    4C2M monomer 2  0
Rpb5           RNApol I    4C2M monomer 2  0
Rpb10          RNApol I    4C2M monomer 2  1
Rpa49          RNApol I    4C2M monomer 2  300
Rpc19          RNApol I    4C2M monomer 2  0
Rpb8           RNApol I    4C2M monomer 2  0
Rpa34          RNApol I    4C2M monomer 2  53
Rpa43          RNApol I    4C2M monomer 2  76
Rpc40          RNApol I    4C2M monomer 2  0
Rpc10          RNApol I    4C2M monomer 2  0
Rpa135         RNApol I    4C2M monomer 2  0
Rpo26          RNApol I    4C2M monomer 2  1
Rpa190         RNApol I    4C3I            1
Rpa14          RNApol I    4C3I            37
Rpb5           RNApol I    4C3I            0
Rpb10          RNApol I    4C3I            1
Rpa49          RNApol I    4C3I            301
Rpc19          RNApol I    4C3I            0
Rpb8           RNApol I    4C3I            0
Rpa34          RNApol I    4C3I            53
Rpa12          RNApol I    4C3I            0
Rpa43          RNApol I    4C3I            10
Rpc40          RNApol I    4C3I            0
Rpc10          RNApol I    4C3I            0
Rpa135         RNApol I    4C3I            0
Rpo26          RNApol I    4C3I            1
Rpb3           RNApol II   4V1N            50
Rpb11          RNApol II   4V1N            6
Rpb5           RNApol II   4V1N            0
Rpb7           RNApol II   4V1N            0
Rpb10          RNApol II   4V1N            5
Rpo26          RNApol II   4V1N            0
Rpb8           RNApol II   4V1N            0
Rpb4           RNApol II   4V1N            0
Rpb9           RNApol II   4V1N            2
Tfg2           RNApol II   4V1N            173
Rpb2           RNApol II   4V1N            0
Rpc10          RNApol II   4V1N            0
Rpo21          RNApol II   4V1N            278
Rpc11          RNApol III  5FJA            0
Rpc19          RNApol III  5FJA            0
Ret1           RNApol III  5FJA            0
Rpb5           RNApol III  5FJA            0
Rpb10          RNApol III  5FJA            3
Rpc37          RNApol III  5FJA            20
Rpc82          RNApol III  5FJA            0
Rpc31          RNApol III  5FJA            182
Rpb8           RNApol III  5FJA            0
Rpc53          RNApol III  5FJA            0
Rpc25          RNApol III  5FJA            0
Rpc34          RNApol III  5FJA            2
Rpo31          RNApol III  5FJA            0
Rpc40          RNApol III  5FJA            0
Rpc10          RNApol III  5FJA            0
Rpc17          RNApol III  5FJA            0
Rpo26          RNApol III  5FJA            2
Rpn6           Proteasome  5CZ4 and 5A5B   3
Rpn5           Proteasome  5CZ4 and 5A5B   3
Rpn3           Proteasome  5CZ4 and 5A5B   45
Rpn2           Proteasome  5CZ4 and 5A5B   20
Rpn1           Proteasome  5CZ4 and 5A5B   0
Rpn9           Proteasome  5CZ4 and 5A5B   6
Rpn8           Proteasome  5CZ4 and 5A5B   30
Pre10          Proteasome  5CZ4 and 5A5B   39
Pre6           Proteasome  5CZ4 and 5A5B   10
Pre7           Proteasome  5CZ4 and 5A5B   0
Rpt3           Proteasome  5CZ4 and 5A5B   0
Rpt2           Proteasome  5CZ4 and 5A5B   1
Pre2           Proteasome  5CZ4 and 5A5B   0
Rpt4           Proteasome  5CZ4 and 5A5B   10
Pre1           Proteasome  5CZ4 and 5A5B   3
Pre8           Proteasome  5CZ4 and 5A5B   0
Pre9           Proteasome  5CZ4 and 5A5B   12
Pup2           Proteasome  5CZ4 and 5A5B   9
Pup3           Proteasome  5CZ4 and 5A5B   0
Pup1           Proteasome  5CZ4 and 5A5B   6
Rpn13          Proteasome  5CZ4 and 5A5B   23
Rpn12          Proteasome  5CZ4 and 5A5B   2
Rpn11          Proteasome  5CZ4 and 5A5B   8
Rpn10          Proteasome  5CZ4 and 5A5B   71
Sem1           Proteasome  5CZ4 and 5A5B   0
Scl1           Proteasome  5CZ4 and 5A5B   0
Rpt1           Proteasome  5CZ4 and 5A5B   11
Pre4           Proteasome  5CZ4 and 5A5B   4
Pre5           Proteasome  5CZ4 and 5A5B   0
Rpt5           Proteasome  5CZ4 and 5A5B   0
Pre3           Proteasome  5CZ4 and 5A5B   0
Rpt6           Proteasome  5CZ4 and 5A5B   9
Rpn7           Proteasome  5CZ4 and 5A5B   7

Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. The Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome: top right; RNApol I, II and III: bottom left; COG complex: bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right owing to the ability of the 4xL linker to detect interactions between proteins that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assays of selected results from the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in standard PCA (left).
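
The thresholding described in panel (B) can be sketched as follows in Python. This is an illustration under stated assumptions: the background distribution is summarized here by its mean and sample standard deviation, which may not match how the background was estimated in the study, and all colony sizes are made-up values. Only the 2.5 cutoff comes from the caption.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # cutoff quoted in the caption

def z_scores(colony_sizes, background_sizes):
    """Express colony sizes as deviations from a background distribution.

    Assumption: the background is summarized by its mean and sample standard
    deviation; the original analysis may estimate it differently.
    """
    mu = mean(background_sizes)
    sigma = stdev(background_sizes)
    return [(size - mu) / sigma for size in colony_sizes]

def call_positives(colony_sizes, background_sizes, threshold=Z_THRESHOLD):
    """Return True for each colony whose z-score exceeds the threshold."""
    return [z > threshold for z in z_scores(colony_sizes, background_sizes)]

# Hypothetical colony sizes (arbitrary units).
background = [10.0, 12.0, 11.0, 9.0, 13.0]
tested = [11.0, 20.0, 14.0]
calls = call_positives(tested, background)
print(calls)  # [False, True, False]
```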

Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
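
The idea of a distance-weighted shortest path over surface residues, as in panels (C) and (D), can be sketched with Dijkstra's algorithm in Python. This is not the thesis's implementation: the neighbour cutoff `max_edge`, the function name and the toy coordinates are assumptions introduced for illustration only.

```python
import heapq
from math import dist

def surface_path_length(points, start, end, max_edge):
    """Length of the shortest path from points[start] to points[end].

    Two points are connected when they lie within `max_edge` of each other,
    and each edge is weighted by its Euclidean length (Dijkstra's algorithm).
    Returns None if the two points are not connected.
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, u = heapq.heappop(queue)
        if u == end:
            return d
        if d > best.get(u, float("inf")):
            continue  # stale queue entry
        for v, p in enumerate(points):
            if v == u:
                continue
            step = dist(points[u], p)
            if step <= max_edge and d + step < best.get(v, float("inf")):
                best[v] = d + step
                heapq.heappush(queue, (d + step, v))
    return None

# Toy coordinates (in Å): the direct hop 0 -> 2 is 6 Å, longer than the cutoff,
# so the path has to go through point 1.
toy_surface = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 0.0, 0.0)]
print(surface_path_length(toy_surface, 0, 2, max_edge=5.0))  # 10.0
```

Restricting edges to nearby surface points forces the path to follow the surface of the complex instead of cutting through it, so the resulting distance is never shorter than the straight-line distance between the two C-termini.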

General conclusion

The goal of this project was to develop a relatively simple hybrid method. Here, "hybrid method" refers to a method that detects associations between proteins that are close to each other in space without these associations necessarily being physical interactions. Such a method would allow a deeper and finer dissection of the architecture of protein complexes. In practice, this meant modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, we first had to verify whether increasing linker length changed the set of detected interactions. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thus improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all unique interactions detected, more than 30% were new. One could therefore expect to obtain almost as many new interactions if this variation of PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths that are used. For example, it could be lower if only a single combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect the majority of the new PPIs and of those whose signal increases.

The rare cases in which the signal decreased with increasing linker length are more likely caused by steric effects than by a destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. Moreover, the use of extended linkers informs on the organization of protein complexes, particularly when it involves central proteins. Finally, the detected associations reflect well the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results, it is possible to confirm that increasing linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to DHFR PCA represents a valuable advance in the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths, which would provide a validation and a point of comparison, and would help detect problems that may have occurred during strain construction. For example, it is then easier to spot faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously with each other unless they are brought very close together, which reduces false positives. DHFR PCA can be performed on solid or in liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These features provide certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths giving several resolutions of the PPI network. The first step would be to determine the maximal length that still detects plausible protein-protein associations while limiting false positives. The second would be to determine the optimal increment, maximizing the new information gained while taking into account the additional complexity that each added linker brings. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and therefore shorter linkers, could prove sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or with other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution made accessible by lengthening the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.

Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.

22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.

43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.

Figure 1. Longer linkers increase the signal-to-noise ratio in a large-scale protein-fragment complementation (PCA) screen and prove useful for inferring the super-organization of protein complexes.

(A) PPI z-scores (representing a quantitative deviation from the background noise) obtained in a large-scale screen using baits fused to the DHFR F[1,2] fragment with a 3xL (left) and a 4xL (right) linker, compared to a 2xL linker. PPIs with a significant difference are highlighted with red triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, considering every pair of reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.
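
To make panels (A) and (C) concrete, the following minimal sketch expresses each PCA signal as a z-score relative to background noise and then counts the two reciprocal bait-prey orientations as a single PPI. It is an illustration only, not the analysis pipeline used for this figure: the function names, the use of the background mean and standard deviation, and the choice to keep the higher of the two reciprocal scores are all assumptions made for the example.

```python
from statistics import mean, stdev


def z_scores(signals, background):
    """Express each signal as a deviation from background, in background SD units.

    `signals` maps (bait, prey) tuples to raw PCA signal values and
    `background` is a list of background measurements (both hypothetical).
    """
    mu = mean(background)
    sigma = stdev(background)
    return {pair: (value - mu) / sigma for pair, value in signals.items()}


def collapse_reciprocal(scores):
    """Merge (X, Y) and (Y, X) into one undirected PPI.

    Assumption: the higher of the two reciprocal z-scores is kept.
    """
    merged = {}
    for (bait, prey), z in scores.items():
        key = frozenset((bait, prey))
        merged[key] = max(z, merged.get(key, float("-inf")))
    return merged


# Hypothetical values, for illustration only.
background = [1.0, 2.0, 3.0]
signals = {
    ("Rpa190", "Rpa135"): 5.0,
    ("Rpa135", "Rpa190"): 4.0,
    ("Pre1", "Pre2"): 2.0,
}
merged = collapse_reciprocal(z_scores(signals, background))
```

Using a `frozenset` as the key makes the pair orientation-independent, so both orientations of a pair land on the same entry.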

Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
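
The kind of analysis summarized in panel (B) can be sketched as follows: a Spearman rank correlation between C-termini distances and z-scores, and a grouping of the z-scores into ten distance classes. This is a minimal standard-library sketch under stated assumptions, not the code used to produce the figure: the function names are hypothetical, and the classes are assumed to be of equal width, which the caption does not specify.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def bin_by_distance(distances, scores, n_bins=10):
    """Group z-scores into equal-width distance classes (an assumption)."""
    low, high = min(distances), max(distances)
    width = (high - low) / n_bins
    bins = [[] for _ in range(n_bins)]
    for d, z in zip(distances, scores):
        # The largest distance falls in the last class; identical distances share class 0.
        index = 0 if width == 0 else min(int((d - low) / width), n_bins - 1)
        bins[index].append(z)
    return bins


# Hypothetical distances (in angstroms) and z-scores, for illustration only.
distances = [10, 20, 30, 40, 50]
scores = [9, 7, 8, 3, 1]
rho = spearman(distances, scores)
classes = bin_by_distance(distances, scores)
```

A negative coefficient, as reported in the caption, means that the PCA signal tends to decrease as the C-termini get farther apart.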

Table S1A. Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request.
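
Since Table S2A itself is not reproduced here, the short sketch below only illustrates what a C-terminus-to-C-terminus distance is: the Euclidean distance between the coordinates of the last modeled residue of two proteins in the same structure. The function names and the coordinates are hypothetical, and taking one representative atom per C-terminus is an assumption, not a description of the modeling procedure used in this work.

```python
import math


def c_terminus_distance(coord_a, coord_b):
    """Euclidean distance between two (x, y, z) positions, in the input units."""
    return math.dist(coord_a, coord_b)


def pairwise_distances(c_termini):
    """Distance for every unordered pair in a {protein: (x, y, z)} mapping."""
    names = sorted(c_termini)
    return {
        (a, b): c_terminus_distance(c_termini[a], c_termini[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical coordinates (in angstroms), for illustration only.
c_termini = {"A": (0.0, 0.0, 0.0), "B": (3.0, 4.0, 0.0), "C": (0.0, 0.0, 12.0)}
table = pairwise_distances(c_termini)
```

When C-terminal residues are missing from a structure, the last resolved residue is only an approximation of the true C-terminus, which is why the number of missing residues matters when interpreting such distances.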

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference       Yeast protein  Complex     Identity (%)
4C2M chain 1    Rpc10          RNApol I    100
4C2M chain 2    Rpa34          RNApol I    92.4
4C2M chain 3    Rpa49          RNApol I    94.4
4C2M chain 4    Rpa43          RNApol I    100
4C2M chain 5    Rpa190         RNApol I    89.7
4C2M chain 6    Rpc40          RNApol I    100
4C2M chain 7    Rpa135         RNApol I    97.2
4C2M chain 8    Rpb5           RNApol I    100
4C2M chain 9    Rpa14          RNApol I    59.6
4C2M chain 10   Rpa43          RNApol I    81.4
4C2M chain 11   Rpo26          RNApol I    100
4C2M chain 12   Rpa12          RNApol I    100
4C2M chain 13   Rpb8           RNApol I    88.2
4C2M chain 14   Rpc19          RNApol I    100
4C2M chain 15   Rpb10          RNApol I    100
4C2M chain 16   Rpa49          RNApol I    100
4C2M chain 17   Rpc10          RNApol I    100
4C2M chain 18   Rpa43          RNApol I    100
4C2M chain 19   Rpa34          RNApol I    92.4
4C2M chain 20   Rpa135         RNApol I    96.2
4C2M chain 21   Rpa190         RNApol I    88.5
4C2M chain 22   Rpa14          RNApol I    55.1
4C2M chain 23   Rpc40          RNApol I    100
4C2M chain 24   Rpo26          RNApol I    100
4C2M chain 25   Rpb5           RNApol I    100
4C2M chain 26   Rpb8           RNApol I    88.2
4C2M chain 27   Rpa43          RNApol I    80.2
4C2M chain 28   Rpb10          RNApol I    100
4C2M chain 29   Rpa12          RNApol I    96
4C2M chain 30   Rpc19          RNApol I    100
4C3I chain A    Rpa190         RNApol I    89.2
4C3I chain C    Rpc40          RNApol I    99.3
4C3I chain B    Rpa135         RNApol I    98.2
4C3I chain E    Rpb5           RNApol I    100
4C3I chain D    Rpa14          RNApol I    55.1
4C3I chain G    Rpa43          RNApol I    78.3
4C3I chain F    Rpo26          RNApol I    100
4C3I chain I    Rpa12          RNApol I    100
4C3I chain H    Rpb8           RNApol I    84.7
4C3I chain K    Rpc19          RNApol I    100
4C3I chain J    Rpb10          RNApol I    100
4C3I chain M    Rpa49          RNApol I    97.2
4C3I chain L    Rpc10          RNApol I    100
4C3I chain N    Rpa34          RNApol I    88
4V1N chain A    Rpo21          RNApol II   97.9
4V1N chain C    Rpb3           RNApol II   100
4V1N chain B    Rpb2           RNApol II   93.6
4V1N chain E    Rpb5           RNApol II   100
4V1N chain D    Rpb4           RNApol II   80.8
4V1N chain G    Rpb7           RNApol II   100
4V1N chain F    Rpo26          RNApol II   100
4V1N chain I    Rpb9           RNApol II   100
4V1N chain H    Rpb8           RNApol II   91
4V1N chain K    Rpb11          RNApol II   100
4V1N chain J    Rpb10          RNApol II   100
4V1N chain L    Rpc10          RNApol II   100
4V1N chain R    Tfg2           RNApol II   60.3
5FJA chain A    Rpo31          RNApol III  96.2
5FJA chain C    Rpc40          RNApol III  100
5FJA chain B    Ret1           RNApol III  100
5FJA chain E    Rpb5           RNApol III  100
5FJA chain D    Rpc17          RNApol III  73.9
5FJA chain G    Rpc25          RNApol III  85.8
5FJA chain F    Rpo26          RNApol III  100
5FJA chain I    Rpc11          RNApol III  82.7
5FJA chain H    Rpb8           RNApol III  94.5
5FJA chain K    Rpc19          RNApol III  100
5FJA chain J    Rpb10          RNApol III  100
5FJA chain M    Rpc37          RNApol III  84.9
5FJA chain L    Rpc10          RNApol III  100
5FJA chain O    Rpc82          RNApol III  84.3
5FJA chain N    Rpc53          RNApol III  73.8
5FJA chain Q    Rpc31          RNApol III  100
5FJA chain P    Rpc34          RNApol III  57.2
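
As a reading aid for the identity values above, the sketch below computes a percent identity between two sequences that have already been aligned. It is only an illustration: the function name is hypothetical, and the convention chosen here (identical residues divided by the number of aligned columns without a gap) is an assumption, since several definitions of percent identity exist and the one used for this table is not stated here.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of gap-free aligned columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # columns containing a gap are ignored (assumption)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned fragments, for illustration only.
identity = percent_identity("MKT-LV", "MKSALV")
```

Under this convention, unresolved or missing regions of a structure do not lower the identity, because they appear as gaps and are skipped.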

Table S2C. Identity between the proteasome structures and the experimental sequences

Reference                        Yeast protein  Complex     Identity (%)
5CZ4-centered chain A            Pre8           Proteasome  100
5CZ4-centered chain AA           Pre4           Proteasome  100
5CZ4-centered chain B            Pre9           Proteasome  100
5CZ4-centered chain BA           Pre3           Proteasome  100
5CZ4-centered chain C            Pre6           Proteasome  100
5CZ4-centered chain D            Pup2           Proteasome  97.1
5CZ4-centered chain E            Pre5           Proteasome  100
5CZ4-centered chain F            Pre10          Proteasome  100
5CZ4-centered chain G            Scl1           Proteasome  100
5CZ4-centered chain H            Pup1           Proteasome  100
5CZ4-centered chain I            Pup3           Proteasome  100
5CZ4-centered chain J            Pre1           Proteasome  100
5CZ4-centered chain K            Pre2           Proteasome  100
5CZ4-centered chain L            Pre7           Proteasome  100
5CZ4-centered chain M            Pre4           Proteasome  100
5CZ4-centered chain N            Pre3           Proteasome  100
5CZ4-centered chain O            Pre8           Proteasome  100
5CZ4-centered chain P            Pre9           Proteasome  100
5CZ4-centered chain Q            Pre6           Proteasome  100
5CZ4-centered chain R            Pup2           Proteasome  97.1
5CZ4-centered chain S            Pre5           Proteasome  100
5CZ4-centered chain T            Pre10          Proteasome  100
5CZ4-centered chain U            Scl1           Proteasome  100
5CZ4-centered chain V            Pup1           Proteasome  100
5CZ4-centered chain W            Pup3           Proteasome  100
5CZ4-centered chain X            Pre1           Proteasome  100
5CZ4-centered chain Y            Pre2           Proteasome  100
5CZ4-centered chain Z            Pre7           Proteasome  100
5A5B-centered chain A            Pre3           Proteasome  100
5A5B-centered chain AA           Rpn7           Proteasome  100
5A5B-centered chain B            Pup1           Proteasome  100
5A5B-centered chain BA           Rpn3           Proteasome  100
5A5B-centered chain C            Pup3           Proteasome  100
5A5B-centered chain CA           Rpn12          Proteasome  100
5A5B-centered chain D            Pre1           Proteasome  100
5A5B-centered chain DA           Rpn8           Proteasome  82.9
5A5B-centered chain E            Pre2           Proteasome  99.5
5A5B-centered chain EA           Rpn11          Proteasome  89.5
5A5B-centered chain F            Pre7           Proteasome  100
5A5B-centered chain FA           Rpn10          Proteasome  100
5A5B-centered chain G            Pre4           Proteasome  100
5A5B-centered chain GA           Rpn13          Proteasome  100
5A5B-centered chain HA           Sem1           Proteasome  100
5A5B-centered chain IA           Rpn1           Proteasome  85.9
5A5B-centered chain J            Scl1           Proteasome  100
5A5B-centered chain K            Pre8           Proteasome  100
5A5B-centered chain L            Pre9           Proteasome  100
5A5B-centered chain M            Pre6           Proteasome  100
5A5B-centered chain N            Pup2           Proteasome  100
5A5B-centered chain O            Pre5           Proteasome  100
5A5B-centered chain P            Pre10          Proteasome  100
5A5B-centered chain Q            Rpt1           Proteasome  88
5A5B-centered chain R            Rpt2           Proteasome  100
5A5B-centered chain S            Rpt6           Proteasome  100
5A5B-centered chain T            Rpt3           Proteasome  100
5A5B-centered chain U            Rpt4           Proteasome  100
5A5B-centered chain V            Rpt5           Proteasome  93.1
5A5B-centered chain W            Rpn2           Proteasome  90.9
5A5B-centered chain X            Rpn9           Proteasome  100
5A5B-centered chain Y            Rpn5           Proteasome  100
5A5B-centered chain Z            Rpn6           Proteasome  100
Constructed proteasome chain 1   Pup1           Proteasome  100
Constructed proteasome chain 10  Pre8           Proteasome  100
Constructed proteasome chain 11  Pre9           Proteasome  100
Constructed proteasome chain 12  Pre6           Proteasome  100
Constructed proteasome chain 13  Pup2           Proteasome  100
Constructed proteasome chain 14  Pre5           Proteasome  100
Constructed proteasome chain 15  Pre10          Proteasome  100
Constructed proteasome chain 16  Rpt1           Proteasome  88
Constructed proteasome chain 17  Rpt2           Proteasome  100
Constructed proteasome chain 18  Rpt6           Proteasome  100
Constructed proteasome chain 19  Rpt3           Proteasome  100
Constructed proteasome chain 2   Pup3           Proteasome  100
Constructed proteasome chain 20  Rpt4           Proteasome  100
Constructed proteasome chain 21  Rpt5           Proteasome  93.1
Constructed proteasome chain 22  Rpn2           Proteasome  90.9
Constructed proteasome chain 23  Rpn9           Proteasome  100
Constructed proteasome chain 24  Rpn5           Proteasome  100
Constructed proteasome chain 25  Rpn6           Proteasome  100
Constructed proteasome chain 26  Rpn7           Proteasome  100
Constructed proteasome chain 27  Rpn3           Proteasome  100
Constructed proteasome chain 28  Rpn12          Proteasome  100
Constructed proteasome chain 29  Rpn8           Proteasome  82.9
Constructed proteasome chain 3   Pre1           Proteasome  100
Constructed proteasome chain 30  Rpn11          Proteasome  89.5
Constructed proteasome chain 31  Rpn10          Proteasome  100
Constructed proteasome chain 32  Rpn13          Proteasome  100
Constructed proteasome chain 33  Sem1           Proteasome  100
Constructed proteasome chain 34  Rpn1           Proteasome  85.9
Constructed proteasome chain 35  Pup1           Proteasome  100
Constructed proteasome chain 36  Pup3           Proteasome  100
Constructed proteasome chain 37  Pre1           Proteasome  100
Constructed proteasome chain 38  Pre2           Proteasome  100
Constructed proteasome chain 39  Pre7           Proteasome  100
Constructed proteasome chain 4   Pre2           Proteasome  100
Constructed proteasome chain 40  Pre4           Proteasome  100
Constructed proteasome chain 41  Pre3           Proteasome  100
Constructed proteasome chain 42  Pre4           Proteasome  100
Constructed proteasome chain 45  Scl1           Proteasome  100
Constructed proteasome chain 46  Pre8           Proteasome  100
Constructed proteasome chain 47  Pre9           Proteasome  100
Constructed proteasome chain 48  Pre6           Proteasome  100
Constructed proteasome chain 49  Pup2           Proteasome  100
Constructed proteasome chain 5   Pre7           Proteasome  100
Constructed proteasome chain 50  Pre5           Proteasome  100
Constructed proteasome chain 51  Pre10          Proteasome  100
Constructed proteasome chain 52  Rpt1           Proteasome  88
Constructed proteasome chain 53  Rpt2           Proteasome  100
Constructed proteasome chain 54  Rpt6           Proteasome  100
Constructed proteasome chain 55  Rpt3           Proteasome  100
Constructed proteasome chain 56  Rpt4           Proteasome  100
Constructed proteasome chain 57  Rpt5           Proteasome  93.1
Constructed proteasome chain 58  Rpn2           Proteasome  90.9
Constructed proteasome chain 59  Rpn9           Proteasome  100
Constructed proteasome chain 6   Pre3           Proteasome  100
Constructed proteasome chain 60  Rpn5           Proteasome  100
Constructed proteasome chain 61  Rpn6           Proteasome  100
Constructed proteasome chain 62  Rpn7           Proteasome  100
Constructed proteasome chain 63  Rpn3           Proteasome  100
Constructed proteasome chain 64  Rpn12          Proteasome  100
Constructed proteasome chain 65  Rpn8           Proteasome  82.9
Constructed proteasome chain 66  Rpn11          Proteasome  89.5
Constructed proteasome chain 67  Rpn10          Proteasome  100
Constructed proteasome chain 68  Rpn13          Proteasome  100
Constructed proteasome chain 69  Sem1           Proteasome  100
Constructed proteasome chain 70  Rpn1           Proteasome  85.9
Constructed proteasome chain 9   Scl1           Proteasome  100


Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Columns: yeast protein, complex, reference structure (PDB ID), number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
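The thresholding described in panel (B) can be illustrated with a minimal Python sketch. It assumes that z-scores are computed against the mean and standard deviation of all colony sizes in a screen, which may differ from the normalization actually used in the thesis. The function names, the example pair labels and the default cutoff of 2.5 (an assumed reading of the caption's threshold) are illustrative only.

```python
from statistics import mean, stdev


def zscores(values):
    """Return the z-score of each value relative to the sample mean and standard deviation."""
    mu = mean(values)
    sigma = stdev(values)
    return [(v - mu) / sigma for v in values]


def positive_ppis(colony_sizes, threshold=2.5):
    """Return the pair labels whose colony-size z-score exceeds `threshold`.

    `colony_sizes` maps a hypothetical pair label to a colony size.
    The cutoff is an assumption and should be set per experiment.
    """
    names = list(colony_sizes)
    scores = zscores([colony_sizes[name] for name in names])
    return [name for name, z in zip(names, scores) if z > threshold]


if __name__ == "__main__":
    # Made-up data: twenty background colonies and one large colony.
    sizes = {f"background_{i}": 90.0 + 20.0 * (i % 2) for i in range(20)}
    sizes["candidate"] = 1000.0
    print(positive_ppis(sizes))
```

Because the cutoff is a parameter, the same function can be rerun with a stricter or looser value to see how the set of positive PPIs changes.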


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the cores from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres/dots: surface. Green spheres: surface residues on the proteasome.
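A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm. This is an illustration under stated assumptions, not the procedure used in the thesis: residues are reduced to single 3D points, two residues are connected when they lie within a hypothetical `cutoff` distance, and each edge is weighted by the Euclidean distance. The function name, the cutoff value and the coordinates are all made up.

```python
import heapq
import math


def shortest_surface_path(coords, start, goal, cutoff=10.0):
    """Length of the distance-weighted shortest path between two residues.

    `coords` maps a residue label to an (x, y, z) tuple. Residues closer
    than `cutoff` are treated as connected (an assumption); the edge
    weight is their Euclidean distance. Returns math.inf if no path exists.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        dist, node = heapq.heappop(heap)
        if node == goal:
            return dist
        if node in visited:
            continue
        visited.add(node)
        for other, xyz in coords.items():
            if other in visited:
                continue
            step = math.dist(coords[node], xyz)
            if step <= cutoff and dist + step < best.get(other, math.inf):
                best[other] = dist + step
                heapq.heappush(heap, (dist + step, other))
    return math.inf


if __name__ == "__main__":
    # Three hypothetical surface points; with a 9-unit cutoff the direct
    # hop from "a" to "c" is not allowed, so the path goes through "b".
    points = {"a": (0.0, 0.0, 0.0), "b": (6.0, 0.0, 0.0), "c": (6.0, 8.0, 0.0)}
    print(shortest_surface_path(points, "a", "c", cutoff=9.0))
```

Restricting the nodes to surface residues, as in panel (D), makes the path follow the outside of the complex instead of cutting straight through it, which is presumably a better proxy for the distance a flexible linker has to span.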


General conclusion

The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid method is one that detects associations between proteins that are close to each other in space, without these associations necessarily being physical interactions. Such a method would allow a deeper and finer dissection of the architecture of protein complexes. Concretely, the approach consisted of modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing linker length changed the interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thus improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all unique interactions detected, more than 30% were new. One could therefore expect to obtain practically as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths that are used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment seems sufficient to detect the majority of the new PPIs and of those whose signal increases.


The rare cases in which the signal decreased as linker length increased are more likely caused by steric effects than by a destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers informs on the organization of protein complexes, particularly when it involves central proteins. Finally, the detected associations reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to DHFR PCA is a valuable advance in the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the linkers of other lengths. This would provide a validation and a point of comparison, and would help detect problems that may have occurred during the construction of the proteins. For example, it becomes easier to spot faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no particular infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are brought very close together, which reduces false positives. DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These features provide certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that gives several resolutions of the PPI network. It would first be necessary to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. It would also be necessary to determine the optimal increment to maximize the new information gained, taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas the resolution should be coarser for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not perfectly known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the preservation of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution made possible by lengthening the linker would give a general picture of their architecture. For the proteome of an organism, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


triangles (3xL) and squares (4xL). (B) Detected PPIs after data filtering for the intra-complex PCA experiment. Blue circle: RNApol I, II and III. Orange square: proteasome. Purple triangle: COG complex. Empty shapes: quantitatively changed PPIs (significantly decreased or increased when compared to the 2xL-2xL reference interaction). Solid shapes: new PPIs (PPIs not detected with the 2xL-2xL reference linker but detected with a longer linker combination). (C) Proportions of quantitatively changed interactions and new PPIs versus unchanged PPIs for all complexes, counting each pair of reciprocal interactions, such as X-DHFR F[1,2]-Y-DHFR F[3] and Y-DHFR F[1,2]-X-DHFR F[3], as a single PPI. (D) Circle plots of all detected PPIs for selected complexes. Line thickness is proportional to the difference between the 4xL-4xL and 2xL-2xL PCA signal for each PPI. Gray lines: unchanged PPIs. Green lines: decreased PPIs. Pink lines: increased and new PPIs. Stripe patterns inside colored boxes represent proteins that were absent from the experiment. (E) Proportion of detected PPIs out of the total tested for each combination of subcomplexes within complexes.
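The counting rule in panel (C), where both orientations of a bait-prey pair are treated as one PPI, can be illustrated with a minimal Python sketch. The protein pairs, the category labels and the tie-break rule used when the two orientations disagree are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def collapse_reciprocal(ppis):
    """Group directed measurements into unordered protein pairs.

    ppis: iterable of (f12_protein, f3_protein, category) tuples.
    Returns a dict mapping frozenset({X, Y}) to the set of categories
    observed in either orientation.
    """
    merged = defaultdict(set)
    for f12_protein, f3_protein, category in ppis:
        merged[frozenset((f12_protein, f3_protein))].add(category)
    return dict(merged)


def proportions(merged, priority=("new", "changed", "unchanged")):
    """Give each unordered pair one category and return the fractions."""
    counts = {label: 0 for label in priority}
    for categories in merged.values():
        # Hypothetical tie-break: keep the first label of `priority` that was seen.
        label = next(p for p in priority if p in categories)
        counts[label] += 1
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Invented example data: five directed measurements, three unique pairs.
example = [
    ("Pre1", "Pre2", "unchanged"),
    ("Pre2", "Pre1", "unchanged"),
    ("Rpn5", "Scl1", "new"),
    ("Rpt1", "Rpt2", "changed"),
    ("Rpt2", "Rpt1", "unchanged"),
]
merged = collapse_reciprocal(example)
fractions = proportions(merged)
print(len(merged), fractions)
```

Using a `frozenset` as the key makes X-Y and Y-X map to the same entry without any explicit sorting of protein names.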


Figure 2. Longer linkers allow for the detection of more distant proteins within complexes

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between the C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini, for interactions that are not affected by longer linkers and those that increase in signal or that are newly detected. p-values of Wilcoxon tests are shown.
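As an illustration of the analysis described in panel (B), the following Python sketch computes a Spearman rank correlation between C-terminal distances and z-scores and assigns each distance to one of ten classes. It uses only the standard library; the data are invented, equal-width binning is an assumption (the legend does not say how the classes were defined), and no p-value is computed.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bin_by_distance(distances, n_bins=10):
    """Equal-width class index (0..n_bins-1) for each distance (assumed scheme)."""
    lo, hi = min(distances), max(distances)
    width = (hi - lo) / n_bins
    return [min(int((d - lo) / width), n_bins - 1) for d in distances]


# Invented values, for illustration only.
distances = [12.0, 25.0, 40.0, 58.0, 75.0, 90.0, 110.0, 130.0]
z_scores = [9.1, 7.4, 8.0, 5.2, 4.9, 3.1, 3.5, 2.6]
rho = spearman(distances, z_scores)
bins = bin_by_distance(distances)
print(round(rho, 3), bins)
```

A negative coefficient, as reported in the legend, means that z-scores tend to decrease as the distance between C-termini increases.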

Table S1A. Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for the global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for the intra-complex experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request.

Table S2B. Identity between each RNApol structure and the experimental sequences

Reference       Yeast protein   Complex      Identity (%)
4C2M chain 1    Rpc10           RNApol I     100
4C2M chain 2    Rpa34           RNApol I     92.4
4C2M chain 3    Rpa49           RNApol I     94.4
4C2M chain 4    Rpa43           RNApol I     100
4C2M chain 5    Rpa190          RNApol I     89.7
4C2M chain 6    Rpc40           RNApol I     100
4C2M chain 7    Rpa135          RNApol I     97.2
4C2M chain 8    Rpb5            RNApol I     100
4C2M chain 9    Rpa14           RNApol I     59.6
4C2M chain 10   Rpa43           RNApol I     81.4
4C2M chain 11   Rpo26           RNApol I     100
4C2M chain 12   Rpa12           RNApol I     100
4C2M chain 13   Rpb8            RNApol I     88.2
4C2M chain 14   Rpc19           RNApol I     100
4C2M chain 15   Rpb10           RNApol I     100
4C2M chain 16   Rpa49           RNApol I     100
4C2M chain 17   Rpc10           RNApol I     100
4C2M chain 18   Rpa43           RNApol I     100
4C2M chain 19   Rpa34           RNApol I     92.4
4C2M chain 20   Rpa135          RNApol I     96.2
4C2M chain 21   Rpa190          RNApol I     88.5
4C2M chain 22   Rpa14           RNApol I     55.1
4C2M chain 23   Rpc40           RNApol I     100
4C2M chain 24   Rpo26           RNApol I     100
4C2M chain 25   Rpb5            RNApol I     100
4C2M chain 26   Rpb8            RNApol I     88.2
4C2M chain 27   Rpa43           RNApol I     80.2
4C2M chain 28   Rpb10           RNApol I     100
4C2M chain 29   Rpa12           RNApol I     96
4C2M chain 30   Rpc19           RNApol I     100
4C3I chain A    Rpa190          RNApol I     89.2
4C3I chain C    Rpc40           RNApol I     99.3
4C3I chain B    Rpa135          RNApol I     98.2
4C3I chain E    Rpb5            RNApol I     100
4C3I chain D    Rpa14           RNApol I     55.1
4C3I chain G    Rpa43           RNApol I     78.3
4C3I chain F    Rpo26           RNApol I     100
4C3I chain I    Rpa12           RNApol I     100
4C3I chain H    Rpb8            RNApol I     84.7
4C3I chain K    Rpc19           RNApol I     100
4C3I chain J    Rpb10           RNApol I     100
4C3I chain M    Rpa49           RNApol I     97.2
4C3I chain L    Rpc10           RNApol I     100
4C3I chain N    Rpa34           RNApol I     88
4V1N chain A    Rpo21           RNApol II    97.9
4V1N chain C    Rpb3            RNApol II    100
4V1N chain B    Rpb2            RNApol II    93.6
4V1N chain E    Rpb5            RNApol II    100
4V1N chain D    Rpb4            RNApol II    80.8
4V1N chain G    Rpb7            RNApol II    100
4V1N chain F    Rpo26           RNApol II    100
4V1N chain I    Rpb9            RNApol II    100
4V1N chain H    Rpb8            RNApol II    91
4V1N chain K    Rpb11           RNApol II    100
4V1N chain J    Rpb10           RNApol II    100
4V1N chain L    Rpc10           RNApol II    100
4V1N chain R    Tfg2            RNApol II    60.3
5FJA chain A    Rpo31           RNApol III   96.2
5FJA chain C    Rpc40           RNApol III   100
5FJA chain B    Ret1            RNApol III   100
5FJA chain E    Rpb5            RNApol III   100
5FJA chain D    Rpc17           RNApol III   73.9
5FJA chain G    Rpc25           RNApol III   85.8
5FJA chain F    Rpo26           RNApol III   100
5FJA chain I    Rpc11           RNApol III   82.7
5FJA chain H    Rpb8            RNApol III   94.5
5FJA chain K    Rpc19           RNApol III   100
5FJA chain J    Rpb10           RNApol III   100
5FJA chain M    Rpc37           RNApol III   84.9
5FJA chain L    Rpc10           RNApol III   100
5FJA chain O    Rpc82           RNApol III   84.3
5FJA chain N    Rpc53           RNApol III   73.8
5FJA chain Q    Rpc31           RNApol III   100
5FJA chain P    Rpc34           RNApol III   57.2

Table S2C. Identity between the proteasome structure and the experimental sequences

Reference   Yeast proteins   Complex   Identity (%)

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 97.1

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 97.1

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 82.9

5A5B-centered chain E Pre2 Proteasome 99.5

5A5B-centered chain EA Rpn11 Proteasome 89.5

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 85.9

5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 93.1

5A5B-centered chain W Rpn2 Proteasome 90.9

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 93.1

Constructed proteasome chain 22 Rpn2 Proteasome 90.9

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 82.9

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 89.5

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 85.9

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100


Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100

Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Yeast proteins   Complex   Reference   Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex (proteasome: top right; RNApol I, II and III: bottom left; COG complex: bottom right) experiments. PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results for the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in standard PCA (left).
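The thresholding described in panel (B) can be sketched as follows in Python. Only the cutoff of 2.5 comes from the legend; how the z-scores were actually computed is not stated in this section, so standardizing against a separate background distribution is an assumption, and all colony sizes and protein pairs below are invented.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # cutoff given in the legend


def z_scores(colony_sizes, background):
    """Standardize colony sizes against a background distribution (assumed scheme)."""
    mu, sigma = mean(background), stdev(background)
    return {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}


def positive_ppis(colony_sizes, background, threshold=Z_THRESHOLD):
    """Keep only the pairs whose z-score exceeds the threshold."""
    scores = z_scores(colony_sizes, background)
    return {pair: z for pair, z in scores.items() if z > threshold}


# Invented values, for illustration only.
background = [100, 110, 95, 105, 90, 100, 108, 92]
colony_sizes = {
    ("Pre1", "Pre2"): 400,
    ("Rpn5", "Scl1"): 140,
    ("Rpt1", "Act1"): 104,
}
hits = positive_ppis(colony_sizes, background)
print(sorted(hits))
```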


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
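A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm over a graph of surface points. This is an illustrative Python sketch, not the procedure used in the study: the neighbor cutoff, the way the graph is built and the coordinates are all assumptions.

```python
import heapq
import math


def build_graph(points, cutoff):
    """Connect points closer than `cutoff`; edge weight is the Euclidean distance."""
    graph = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if the target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf


# Invented coordinates: two "C-termini" (points 0 and 2) and one surface point
# between them. The direct 0-2 link (length 6) exceeds the cutoff, so the path
# has to go through point 1.
points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 0.0, 0.0)]
graph = build_graph(points, cutoff=5.5)
path_length = shortest_path_length(graph, 0, 2)
print(path_length)
```

Restricting edges to nearby surface points forces the path to follow the surface of the complex instead of cutting through it, which is why the path length can exceed the straight-line distance.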


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are in spatial proximity without necessarily interacting physically. Such a method would thus make it possible to explore and dissect the architecture of protein complexes in greater depth. Concretely, the approach consisted in modifying the length of the linkers of the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results obtained with the distances measured from the available protein structures of the proteasome.

The results of the first validation show that by adjusting a single parameter, namely doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be reduced by using only one combination, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and those whose signal increases.

The rare cases in which the signal decreased as linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. However, these cases can still provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when it involves central proteins. Finally, the detected associations reflect well the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results obtained, it is possible to confirm that increasing the linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to the DHFR PCA represents a real advance in the study of protein-protein associations. Simply by doubling the length of the linker of the DHFR F[1,2] fragment, it is possible to increase the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the linkers of additional lengths, which would provide a validation and a point of comparison and would help detect problems that may have arisen during construction of the proteins. For example, it is easier to spot a problem of incorrect recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Admittedly, adding this control complicates the experiments and the analyses. Despite this drawback, this variation of the DHFR PCA gives access to an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied on a large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously with each other unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the study of interaction dynamics (56). These features provide certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would provide several resolutions of the PPI network. It would first be necessary to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. It would also be necessary to determine the optimal increment to maximize new information, taking into account the additional complexity of each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide an image of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and therefore shorter linkers, could be sufficient, whereas the resolution should be lower for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by linker extension would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.

Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43 Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69 44 Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6 45 Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32 46 Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54 47 Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36) 48 Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82 49 Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88 50 Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848 51 Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6 52 Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4 53 Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63 54 Michnick SW Exploring protein interactions by interaction-induced folding of proteins from 
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14


64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9


84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709



Figure 2. Longer linkers allow for the detection of more distant proteins within complexes.

(A) Structures of RNApol I, II and III and of the proteasome. Green: proteins shared by at least two out of the three RNApol. Blue: proteins specific to one RNApol. Dark red: proteasome catalytic subunit. Red: proteasome base. Orange: proteasome lid. Proteins located at different distances or in different subunits are highlighted on each structure. Distances between C-termini of these selected proteins and the associated PPI z-scores for these newly detected interactions are indicated in the tables. DHFR fragments have also been modeled and are presented at the same scale as the proteasome structure. (B) (Left) Correlation between all detected PPIs in the proteasome (z-scores) and the distance between the C-termini (2xL-2xL: Spearman r = -0.34, p-value = 2.249e-15; 2xL-4xL: r = -0.36, p-value < 2.2e-16; 4xL-2xL: r = -0.36, p-value < 2.2e-16; 4xL-4xL: r = -0.40, p-value < 2.2e-16). Data were binned into ten distance classes. (Right) Distribution of cumulative z-scores for the proteasome PPIs according to the different protein pairwise distances. (C) Distribution of three categories of detected PPIs for the RNApol and proteasome complexes according to the distance between the C-termini: interactions that are not affected by longer linkers, those that increase in signal and those that are newly detected. p-values of Wilcoxon tests are shown.
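
As an illustration of the statistic reported in panel (B), the following sketch computes a Spearman rank correlation between C-terminal distances and PPI z-scores in plain Python. The function names and the example values are hypothetical and are not data from this study; the software actually used for the analysis is not stated here.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's r: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical values in arbitrary units (not data from this study).
distances = [20.0, 45.0, 80.0, 120.0, 150.0]
z_scores = [6.1, 4.0, 4.5, 2.2, 1.0]
rho = spearman(distances, z_scores)
```

A negative coefficient, as in the caption, means that the PPI signal tends to decrease as the distance between the C-termini increases.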


Table S1A. Description of the strains constructed and used for this study.

Table S1A is too lengthy to be included in this document but can be obtained upon request.

Table S1B. PCA data for global PCA experiment.

Table S1B is too lengthy to be included in this document but can be obtained upon request.

Table S1C. PCA data for intra-complexes experiment.

Table S1C is too lengthy to be included in this document but can be obtained upon request.

Table S1D. PCR primers used in this study.

Table S1D is too lengthy to be included in this document but can be obtained upon request.

Table S2A. Distances between C-termini calculated from molecular modeling.

Table S2A is too lengthy to be included in this document but can be obtained upon request.


Table S2B. Identity between each RNApol structure and the experimental sequences

Reference       Yeast protein  Complex     Identity (%)
4C2M chain 1    Rpc10          RNApol I    100
4C2M chain 2    Rpa34          RNApol I    92.4
4C2M chain 3    Rpa49          RNApol I    94.4
4C2M chain 4    Rpa43          RNApol I    100
4C2M chain 5    Rpa190         RNApol I    89.7
4C2M chain 6    Rpc40          RNApol I    100
4C2M chain 7    Rpa135         RNApol I    97.2
4C2M chain 8    Rpb5           RNApol I    100
4C2M chain 9    Rpa14          RNApol I    59.6
4C2M chain 10   Rpa43          RNApol I    81.4
4C2M chain 11   Rpo26          RNApol I    100
4C2M chain 12   Rpa12          RNApol I    100
4C2M chain 13   Rpb8           RNApol I    88.2
4C2M chain 14   Rpc19          RNApol I    100
4C2M chain 15   Rpb10          RNApol I    100
4C2M chain 16   Rpa49          RNApol I    100
4C2M chain 17   Rpc10          RNApol I    100
4C2M chain 18   Rpa43          RNApol I    100
4C2M chain 19   Rpa34          RNApol I    92.4
4C2M chain 20   Rpa135         RNApol I    96.2
4C2M chain 21   Rpa190         RNApol I    88.5
4C2M chain 22   Rpa14          RNApol I    55.1
4C2M chain 23   Rpc40          RNApol I    100
4C2M chain 24   Rpo26          RNApol I    100
4C2M chain 25   Rpb5           RNApol I    100
4C2M chain 26   Rpb8           RNApol I    88.2
4C2M chain 27   Rpa43          RNApol I    80.2
4C2M chain 28   Rpb10          RNApol I    100
4C2M chain 29   Rpa12          RNApol I    96
4C2M chain 30   Rpc19          RNApol I    100
4C3I chain A    Rpa190         RNApol I    89.2
4C3I chain C    Rpc40          RNApol I    99.3
4C3I chain B    Rpa135         RNApol I    98.2
4C3I chain E    Rpb5           RNApol I    100
4C3I chain D    Rpa14          RNApol I    55.1
4C3I chain G    Rpa43          RNApol I    78.3
4C3I chain F    Rpo26          RNApol I    100
4C3I chain I    Rpa12          RNApol I    100
4C3I chain H    Rpb8           RNApol I    84.7
4C3I chain K    Rpc19          RNApol I    100
4C3I chain J    Rpb10          RNApol I    100
4C3I chain M    Rpa49          RNApol I    97.2
4C3I chain L    Rpc10          RNApol I    100
4C3I chain N    Rpa34          RNApol I    88
4V1N chain A    Rpo21          RNApol II   97.9
4V1N chain C    Rpb3           RNApol II   100
4V1N chain B    Rpb2           RNApol II   93.6
4V1N chain E    Rpb5           RNApol II   100
4V1N chain D    Rpb4           RNApol II   80.8
4V1N chain G    Rpb7           RNApol II   100
4V1N chain F    Rpo26          RNApol II   100
4V1N chain I    Rpb9           RNApol II   100
4V1N chain H    Rpb8           RNApol II   91
4V1N chain K    Rpb11          RNApol II   100
4V1N chain J    Rpb10          RNApol II   100
4V1N chain L    Rpc10          RNApol II   100
4V1N chain R    Tfg2           RNApol II   60.3
5FJA chain A    Rpo31          RNApol III  96.2
5FJA chain C    Rpc40          RNApol III  100
5FJA chain B    Ret1           RNApol III  100
5FJA chain E    Rpb5           RNApol III  100
5FJA chain D    Rpc17          RNApol III  73.9
5FJA chain G    Rpc25          RNApol III  85.8
5FJA chain F    Rpo26          RNApol III  100
5FJA chain I    Rpc11          RNApol III  82.7
5FJA chain H    Rpb8           RNApol III  94.5
5FJA chain K    Rpc19          RNApol III  100
5FJA chain J    Rpb10          RNApol III  100
5FJA chain M    Rpc37          RNApol III  84.9
5FJA chain L    Rpc10          RNApol III  100
5FJA chain O    Rpc82          RNApol III  84.3
5FJA chain N    Rpc53          RNApol III  73.8
5FJA chain Q    Rpc31          RNApol III  100
5FJA chain P    Rpc34          RNApol III  57.2


Table S2C. Identity between proteasome structure and the experimental sequence

Reference                        Yeast protein  Complex     Identity (%)
5CZ4-centered chain A            Pre8           Proteasome  100
5CZ4-centered chain AA           Pre4           Proteasome  100
5CZ4-centered chain B            Pre9           Proteasome  100
5CZ4-centered chain BA           Pre3           Proteasome  100
5CZ4-centered chain C            Pre6           Proteasome  100
5CZ4-centered chain D            Pup2           Proteasome  97.1
5CZ4-centered chain E            Pre5           Proteasome  100
5CZ4-centered chain F            Pre10          Proteasome  100
5CZ4-centered chain G            Scl1           Proteasome  100
5CZ4-centered chain H            Pup1           Proteasome  100
5CZ4-centered chain I            Pup3           Proteasome  100
5CZ4-centered chain J            Pre1           Proteasome  100
5CZ4-centered chain K            Pre2           Proteasome  100
5CZ4-centered chain L            Pre7           Proteasome  100
5CZ4-centered chain M            Pre4           Proteasome  100
5CZ4-centered chain N            Pre3           Proteasome  100
5CZ4-centered chain O            Pre8           Proteasome  100
5CZ4-centered chain P            Pre9           Proteasome  100
5CZ4-centered chain Q            Pre6           Proteasome  100
5CZ4-centered chain R            Pup2           Proteasome  97.1
5CZ4-centered chain S            Pre5           Proteasome  100
5CZ4-centered chain T            Pre10          Proteasome  100
5CZ4-centered chain U            Scl1           Proteasome  100
5CZ4-centered chain V            Pup1           Proteasome  100
5CZ4-centered chain W            Pup3           Proteasome  100
5CZ4-centered chain X            Pre1           Proteasome  100
5CZ4-centered chain Y            Pre2           Proteasome  100
5CZ4-centered chain Z            Pre7           Proteasome  100
5A5B-centered chain A            Pre3           Proteasome  100
5A5B-centered chain AA           Rpn7           Proteasome  100
5A5B-centered chain B            Pup1           Proteasome  100
5A5B-centered chain BA           Rpn3           Proteasome  100
5A5B-centered chain C            Pup3           Proteasome  100
5A5B-centered chain CA           Rpn12          Proteasome  100
5A5B-centered chain D            Pre1           Proteasome  100
5A5B-centered chain DA           Rpn8           Proteasome  82.9
5A5B-centered chain E            Pre2           Proteasome  99.5
5A5B-centered chain EA           Rpn11          Proteasome  89.5
5A5B-centered chain F            Pre7           Proteasome  100
5A5B-centered chain FA           Rpn10          Proteasome  100
5A5B-centered chain G            Pre4           Proteasome  100
5A5B-centered chain GA           Rpn13          Proteasome  100
5A5B-centered chain HA           Sem1           Proteasome  100
5A5B-centered chain IA           Rpn1           Proteasome  85.9
5A5B-centered chain J            Scl1           Proteasome  100
5A5B-centered chain K            Pre8           Proteasome  100
5A5B-centered chain L            Pre9           Proteasome  100
5A5B-centered chain M            Pre6           Proteasome  100
5A5B-centered chain N            Pup2           Proteasome  100
5A5B-centered chain O            Pre5           Proteasome  100
5A5B-centered chain P            Pre10          Proteasome  100
5A5B-centered chain Q            Rpt1           Proteasome  88
5A5B-centered chain R            Rpt2           Proteasome  100
5A5B-centered chain S            Rpt6           Proteasome  100
5A5B-centered chain T            Rpt3           Proteasome  100
5A5B-centered chain U            Rpt4           Proteasome  100
5A5B-centered chain V            Rpt5           Proteasome  93.1
5A5B-centered chain W            Rpn2           Proteasome  90.9
5A5B-centered chain X            Rpn9           Proteasome  100
5A5B-centered chain Y            Rpn5           Proteasome  100
5A5B-centered chain Z            Rpn6           Proteasome  100
Constructed proteasome chain 1   Pup1           Proteasome  100
Constructed proteasome chain 10  Pre8           Proteasome  100
Constructed proteasome chain 11  Pre9           Proteasome  100
Constructed proteasome chain 12  Pre6           Proteasome  100
Constructed proteasome chain 13  Pup2           Proteasome  100
Constructed proteasome chain 14  Pre5           Proteasome  100
Constructed proteasome chain 15  Pre10          Proteasome  100
Constructed proteasome chain 16  Rpt1           Proteasome  88
Constructed proteasome chain 17  Rpt2           Proteasome  100
Constructed proteasome chain 18  Rpt6           Proteasome  100
Constructed proteasome chain 19  Rpt3           Proteasome  100
Constructed proteasome chain 2   Pup3           Proteasome  100
Constructed proteasome chain 20  Rpt4           Proteasome  100
Constructed proteasome chain 21  Rpt5           Proteasome  93.1
Constructed proteasome chain 22  Rpn2           Proteasome  90.9
Constructed proteasome chain 23  Rpn9           Proteasome  100
Constructed proteasome chain 24  Rpn5           Proteasome  100
Constructed proteasome chain 25  Rpn6           Proteasome  100
Constructed proteasome chain 26  Rpn7           Proteasome  100
Constructed proteasome chain 27  Rpn3           Proteasome  100
Constructed proteasome chain 28  Rpn12          Proteasome  100
Constructed proteasome chain 29  Rpn8           Proteasome  82.9
Constructed proteasome chain 3   Pre1           Proteasome  100
Constructed proteasome chain 30  Rpn11          Proteasome  89.5
Constructed proteasome chain 31  Rpn10          Proteasome  100
Constructed proteasome chain 32  Rpn13          Proteasome  100
Constructed proteasome chain 33  Sem1           Proteasome  100
Constructed proteasome chain 34  Rpn1           Proteasome  85.9
Constructed proteasome chain 35  Pup1           Proteasome  100
Constructed proteasome chain 36  Pup3           Proteasome  100
Constructed proteasome chain 37  Pre1           Proteasome  100
Constructed proteasome chain 38  Pre2           Proteasome  100
Constructed proteasome chain 39  Pre7           Proteasome  100
Constructed proteasome chain 4   Pre2           Proteasome  100
Constructed proteasome chain 40  Pre4           Proteasome  100
Constructed proteasome chain 41  Pre3           Proteasome  100
Constructed proteasome chain 42  Pre4           Proteasome  100
Constructed proteasome chain 45  Scl1           Proteasome  100
Constructed proteasome chain 46  Pre8           Proteasome  100
Constructed proteasome chain 47  Pre9           Proteasome  100
Constructed proteasome chain 48  Pre6           Proteasome  100
Constructed proteasome chain 49  Pup2           Proteasome  100
Constructed proteasome chain 5   Pre7           Proteasome  100
Constructed proteasome chain 50  Pre5           Proteasome  100
Constructed proteasome chain 51  Pre10          Proteasome  100
Constructed proteasome chain 52  Rpt1           Proteasome  88
Constructed proteasome chain 53  Rpt2           Proteasome  100
Constructed proteasome chain 54  Rpt6           Proteasome  100
Constructed proteasome chain 55  Rpt3           Proteasome  100
Constructed proteasome chain 56  Rpt4           Proteasome  100
Constructed proteasome chain 57  Rpt5           Proteasome  93.1
Constructed proteasome chain 58  Rpn2           Proteasome  90.9
Constructed proteasome chain 59  Rpn9           Proteasome  100
Constructed proteasome chain 6   Pre3           Proteasome  100
Constructed proteasome chain 60  Rpn5           Proteasome  100
Constructed proteasome chain 61  Rpn6           Proteasome  100
Constructed proteasome chain 62  Rpn7           Proteasome  100
Constructed proteasome chain 63  Rpn3           Proteasome  100
Constructed proteasome chain 64  Rpn12          Proteasome  100
Constructed proteasome chain 65  Rpn8           Proteasome  82.9
Constructed proteasome chain 66  Rpn11          Proteasome  89.5
Constructed proteasome chain 67  Rpn10          Proteasome  100
Constructed proteasome chain 68  Rpn13          Proteasome  100
Constructed proteasome chain 69  Sem1           Proteasome  100
Constructed proteasome chain 70  Rpn1           Proteasome  85.9
Constructed proteasome chain 9   Scl1           Proteasome  100


Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I, II and III and proteasome structures

Yeast protein   Complex   Reference   Number of missing residues in C-terminus

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complexes experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect interactions that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results from the intra-complexes experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
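
The thresholding described in panel (B) can be sketched as follows, assuming that z-scores are computed against the mean and standard deviation of a background distribution of colony sizes. That choice of reference distribution, the function name and the example values are assumptions made for illustration and are not taken from this study; only the 2.5 cutoff comes from the caption.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.5  # cutoff quoted in the caption


def positive_ppis(colony_sizes, background_sizes, threshold=Z_THRESHOLD):
    """Return {pair: z-score} for pairs whose z-score exceeds the threshold.

    colony_sizes: dict mapping a (bait, prey) pair to its colony size.
    background_sizes: colony sizes assumed to represent non-interacting pairs.
    """
    mu = mean(background_sizes)
    sigma = stdev(background_sizes)
    scores = {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}
    return {pair: z for pair, z in scores.items() if z > threshold}


# Hypothetical colony sizes in arbitrary units (not data from this study).
background = [100, 110, 90, 105, 95]
sizes = {("A", "B"): 130, ("A", "C"): 104}
hits = positive_ppis(sizes, background)  # only ("A", "B") passes the cutoff
```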


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres (dots): surface. Green spheres: surface residues on the proteasome.
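
One way to picture the distance-weighted shortest path of panels (C) and (D) is Dijkstra's algorithm on a graph whose nodes are surface residues. The sketch below assumes that two residues are linked when they lie within a cutoff distance of each other and that each link is weighted by its Euclidean length. The cutoff, the function name and the coordinates are hypothetical; the procedure actually used to build the graph is not detailed in this caption.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, cutoff):
    """Length of the shortest path between two points of coords.

    coords: list of (x, y, z) tuples, e.g. surface residue positions.
    start, end: indices into coords, e.g. the two C-terminal residues.
    Points closer than or equal to cutoff are linked; each link is
    weighted by its Euclidean length. Returns math.inf if no path exists.
    """
    best = [math.inf] * len(coords)
    best[start] = 0.0
    queue = [(0.0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == end:
            return dist
        if dist > best[node]:
            continue  # stale queue entry, a shorter route was already found
        for other, point in enumerate(coords):
            if other == node:
                continue
            step = math.dist(coords[node], point)
            if step <= cutoff and dist + step < best[other]:
                best[other] = dist + step
                heapq.heappush(queue, (best[other], other))
    return math.inf


# Hypothetical coordinates in arbitrary units (not taken from a PDB file).
points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 0.0, 0.0)]
path_length = shortest_surface_path(points, start=0, end=2, cutoff=5.0)
```

In this toy case the two end points are 6 units apart in a straight line, but the path has to pass through the intermediate point, so the path length is longer than the straight-line distance. This is the reason for measuring distances along the surface rather than through the complex.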


General conclusion

The aim of this project was to develop a relatively simple hybrid method. Here, "hybrid method" refers to a method that detects associations between proteins that are close to each other in space without necessarily interacting physically. Such a method would make it possible to dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to test whether increasing linker length changed the interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, validation could be completed by comparing the results with the distances measured from the available protein structures of the proteasome.

Les reacutesultats de la premiegravere validation deacutemontrent qursquoen jouant sur un seul paramegravetre soit

en doublant la longueur drsquoun connecteur le ratio signal sur bruit a significativement

augmenteacute permettant une meilleure identification des associations Sept nouvelles

associations ont eacuteteacute observeacutees agrave lrsquointeacuterieur de complexes proteacuteiques et entre diffeacuterents

complexes notamment entre le proteacuteasome et le cytosquelette drsquoactine La nature des

associations deacutetecteacutees suggegravere que la speacutecificiteacute de la DHFR PCA est conserveacutee malgreacute la

modification de la longueur du connecteur Lrsquoeacutetude approfondie des cinq complexes

proteacuteiques montre que la variation de la DHFR PCA permet de deacutetecter de nouvelles

interactions en conservant la speacutecificiteacute de la meacutethode En effet parmi lrsquoensemble des

interactions uniques deacutetecteacutees plus de 30 eacutetaient nouvelles Donc on pourrait srsquoattendre agrave

obtenir pratiquement autant de nouvelles interactions si cette variation de la PCA eacutetait

appliqueacutee agrave des complexes proteacuteiques deacutejagrave eacutetudieacutes Ce pourcentage pourrait varier selon le

nombre de combinaisons de connecteurs de diffeacuterentes longueurs utiliseacute Par exemple ce

nombre pourrait ecirctre reacuteduit en nrsquoutilisant qursquoune seule combinaison puisque certaines

associations proteacuteine-proteacuteine eacutetaient uniquement deacutetectables avec une combinaison preacutecise

de connecteurs Lrsquoutilisation drsquoun connecteur allongeacute pour le fragment DHFR F[12] semble

ecirctre suffisante pour deacutetecter la majoriteacute des nouvelles PPI et celles dont le signal augmente

44

The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers is informative about the organization of protein complexes, particularly when central proteins are involved. Finally, the detected associations accurately reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does make it possible to detect associations between proteins that are farther apart in space.
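The comparison between structural distances and PCA results can be illustrated with a minimal sketch. It is not the analysis pipeline used in this work: the protein labels, the C-terminal coordinates and the two reach thresholds are hypothetical placeholders, and the idea that a linker defines a single cut-off distance is a simplifying assumption made only for illustration.

```python
import math


def c_term_distance(a, b):
    """Euclidean distance between two C-terminal coordinates (x, y, z), in angstroms."""
    return math.dist(a, b)


def detectable_pairs(coords, max_reach):
    """Return the protein pairs whose C-termini lie within max_reach of each other."""
    names = sorted(coords)
    pairs = []
    for i, p in enumerate(names):
        for q in names[i + 1:]:
            if c_term_distance(coords[p], coords[q]) <= max_reach:
                pairs.append((p, q))
    return pairs


# Hypothetical C-terminal coordinates (angstroms); not taken from any real structure.
coords = {"A": (0.0, 0.0, 0.0), "B": (30.0, 40.0, 0.0), "C": (0.0, 0.0, 120.0)}

# Hypothetical reach values standing in for a standard and a doubled linker.
short_reach, long_reach = 40.0, 80.0

print(detectable_pairs(coords, short_reach))
print(detectable_pairs(coords, long_reach))
```

With these made-up values, the A-B pair (50 angstroms apart) falls outside the shorter reach but inside the longer one, which mirrors the qualitative observation that a longer linker reveals more distant associations.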

The modification made to the DHFR PCA represents a valuable advance in the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths, which would provide a validation and a point of comparison and would help detect problems that may have arisen during construction of the proteins. For example, it becomes easier to spot a faulty recombination event or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that keeps protein expression under the control of the endogenous promoter. The fragments do not tend to associate spontaneously unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium, so PPIs are easy to study under multiple growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the dynamics of interactions (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that provides several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also need to be determined, so as to maximize the new information gained while accounting for the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2]-tagged proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.
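The idea of viewing the network at several resolutions can be sketched as follows, assuming a hypothetical data set in which each linker length has its own set of detected pairs. The linker lengths (arbitrary units), the protein names and the function `finest_resolution` are all invented for illustration; only two lengths were actually tested in this project.

```python
def finest_resolution(detections):
    """Map each protein pair to the shortest linker length at which it was detected.

    detections: dict mapping a linker length to the set of pairs detected with it.
    """
    finest = {}
    for length in sorted(detections):
        for pair in detections[length]:
            # Keep only the first (shortest) length at which the pair appears.
            finest.setdefault(pair, length)
    return finest


# Hypothetical linker lengths (arbitrary units) and hypothetical protein pairs.
detections = {
    1: {("P1", "P2")},
    2: {("P1", "P2"), ("P1", "P3")},
    4: {("P1", "P2"), ("P1", "P3"), ("P2", "P4")},
}

resolution = finest_resolution(detections)
for pair in sorted(resolution):
    print(pair, resolution[pair])
```

Pairs that first appear only with the longest linker would correspond to the coarsest view of the network, while pairs already detected with the shortest linker would correspond to the finest one.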

The method developed during this master's project becomes particularly relevant for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.



Table S1A Description of the strains constructed and used for this study

Table S1A is too lengthy to be included in this document but can be obtained upon request

Table S1B PCA data for global PCA experiment

Table S1B is too lengthy to be included in this document but can be obtained upon request

Table S1C PCA data for intra-complexes experiment

Table S1C is too lengthy to be included in this document but can be obtained upon request

Table S1D PCR primers used in this study

Table S1D is too lengthy to be included in this document but can be obtained upon request


Table S2A Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request


Table S2B Identity between each RNApol structures and the experimental sequences

Reference  Yeast proteins  Complex  Identity (%)
4C2M chain 1  Rpc10  RNApol I  100
4C2M chain 2  Rpa34  RNApol I  92.4
4C2M chain 3  Rpa49  RNApol I  94.4
4C2M chain 4  Rpa43  RNApol I  100
4C2M chain 5  Rpa190  RNApol I  89.7
4C2M chain 6  Rpc40  RNApol I  100
4C2M chain 7  Rpa135  RNApol I  97.2
4C2M chain 8  Rpb5  RNApol I  100
4C2M chain 9  Rpa14  RNApol I  59.6
4C2M chain 10  Rpa43  RNApol I  81.4
4C2M chain 11  Rpo26  RNApol I  100
4C2M chain 12  Rpa12  RNApol I  100
4C2M chain 13  Rpb8  RNApol I  88.2
4C2M chain 14  Rpc19  RNApol I  100
4C2M chain 15  Rpb10  RNApol I  100
4C2M chain 16  Rpa49  RNApol I  100
4C2M chain 17  Rpc10  RNApol I  100
4C2M chain 18  Rpa43  RNApol I  100
4C2M chain 19  Rpa34  RNApol I  92.4
4C2M chain 20  Rpa135  RNApol I  96.2
4C2M chain 21  Rpa190  RNApol I  88.5
4C2M chain 22  Rpa14  RNApol I  55.1
4C2M chain 23  Rpc40  RNApol I  100
4C2M chain 24  Rpo26  RNApol I  100
4C2M chain 25  Rpb5  RNApol I  100
4C2M chain 26  Rpb8  RNApol I  88.2
4C2M chain 27  Rpa43  RNApol I  80.2
4C2M chain 28  Rpb10  RNApol I  100
4C2M chain 29  Rpa12  RNApol I  96
4C2M chain 30  Rpc19  RNApol I  100
4C3I chain A  Rpa190  RNApol I  89.2
4C3I chain C  Rpc40  RNApol I  99.3
4C3I chain B  Rpa135  RNApol I  98.2
4C3I chain E  Rpb5  RNApol I  100
4C3I chain D  Rpa14  RNApol I  55.1
4C3I chain G  Rpa43  RNApol I  78.3
4C3I chain F  Rpo26  RNApol I  100
4C3I chain I  Rpa12  RNApol I  100
4C3I chain H  Rpb8  RNApol I  84.7
4C3I chain K  Rpc19  RNApol I  100
4C3I chain J  Rpb10  RNApol I  100
4C3I chain M  Rpa49  RNApol I  97.2
4C3I chain L  Rpc10  RNApol I  100
4C3I chain N  Rpa34  RNApol I  88
4V1N chain A  Rpo21  RNApol II  97.9
4V1N chain C  Rpb3  RNApol II  100
4V1N chain B  Rpb2  RNApol II  93.6
4V1N chain E  Rpb5  RNApol II  100
4V1N chain D  Rpb4  RNApol II  80.8
4V1N chain G  Rpb7  RNApol II  100
4V1N chain F  Rpo26  RNApol II  100
4V1N chain I  Rpb9  RNApol II  100
4V1N chain H  Rpb8  RNApol II  91
4V1N chain K  Rpb11  RNApol II  100
4V1N chain J  Rpb10  RNApol II  100
4V1N chain L  Rpc10  RNApol II  100
4V1N chain R  Tfg2  RNApol II  60.3
5FJA chain A  Rpo31  RNApol III  96.2
5FJA chain C  Rpc40  RNApol III  100
5FJA chain B  Ret1  RNApol III  100
5FJA chain E  Rpb5  RNApol III  100
5FJA chain D  Rpc17  RNApol III  73.9
5FJA chain G  Rpc25  RNApol III  85.8
5FJA chain F  Rpo26  RNApol III  100
5FJA chain I  Rpc11  RNApol III  82.7
5FJA chain H  Rpb8  RNApol III  94.5
5FJA chain K  Rpc19  RNApol III  100
5FJA chain J  Rpb10  RNApol III  100
5FJA chain M  Rpc37  RNApol III  84.9
5FJA chain L  Rpc10  RNApol III  100
5FJA chain O  Rpc82  RNApol III  84.3
5FJA chain N  Rpc53  RNApol III  73.8
5FJA chain Q  Rpc31  RNApol III  100
5FJA chain P  Rpc34  RNApol III  57.2
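As an illustration of what an identity percentage of this kind represents, the sketch below computes percent identity between two pre-aligned sequences. It is an assumption-laden example, not the procedure used to produce Table S2B: the function name, the gap symbol `-`, the choice to ignore gapped columns in the denominator, and the two short sequences are all hypothetical, and alignment tools differ in how they define the denominator.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns, skipping columns where either sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # gapped column: not counted in this (assumed) definition
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments; '-' marks a gap.
structure_seq = "MKT-AY"
experimental_seq = "MKTLAF"
print(percent_identity(structure_seq, experimental_seq))
```

Under this definition, residues missing from a structure lower the number of compared columns rather than the identity itself, which is one reason a separate count of missing residues is useful alongside the identity value.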


Table S2C Identity between proteasome structure and the experimental sequence

Reference Yeast

proteins Complex

Identity

()

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 971

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 971

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 829

5A5B-centered chain E Pre2 Proteasome 995

5A5B-centered chain EA Rpn11 Proteasome 895

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 859

35

5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 931

5A5B-centered chain W Rpn2 Proteasome 909

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 931

Constructed proteasome chain 22 Rpn2 Proteasome 909

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 829

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 895

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 859

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100

36

Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 931

Constructed proteasome chain 58 Rpn2 Proteasome 909

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 829

Constructed proteasome chain 66 Rpn11 Proteasome 895

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100


Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I, II and III and proteasome structures

Columns: yeast protein, complex, reference structure, number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7
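
The counts in Table S2D can be illustrated with a minimal sketch. It assumes that residues in the structure are numbered on the same scale as the full-length experimental sequence, which the source does not state. The function name `missing_c_terminal` and its arguments are hypothetical and are not taken from the original analysis.

```python
def missing_c_terminal(full_length, resolved_residue_numbers):
    """Count residues missing at the C-terminus of a modelled chain.

    full_length: length of the experimental (full) protein sequence.
    resolved_residue_numbers: residue numbers present in the structure,
        assumed to use the same numbering as the full sequence.
    """
    resolved = list(resolved_residue_numbers)
    if not resolved:
        # Nothing is resolved, so the whole chain counts as missing.
        return full_length
    last_resolved = max(resolved)
    if last_resolved > full_length:
        raise ValueError("structure numbering exceeds the sequence length")
    return full_length - last_resolved
```

A chain with many missing C-terminal residues places the modelled C-terminus far from the position where the DHFR fragment is actually fused, so distances computed for such chains should be interpreted with caution.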


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 25. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths, according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect interactions that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
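
The thresholding described in panel (B) can be sketched as follows. This is only an illustration: it assumes that z-scores are computed against the mean and sample standard deviation of all colony sizes in one experiment, which the caption does not specify. The function name `call_positive_ppis` and the threshold argument are hypothetical, and the threshold is left to the caller.

```python
from statistics import mean, stdev


def call_positive_ppis(colony_sizes, z_threshold):
    """Convert colony sizes to z-scores and list the pairs above a threshold.

    colony_sizes: dict mapping a bait-prey pair label to its colony size.
    z_threshold: z-score cutoff chosen by the caller (assumption).
    Returns (z_scores, positives), with positives sorted by label.
    """
    sizes = list(colony_sizes.values())
    mu = mean(sizes)
    sigma = stdev(sizes)  # needs at least two values that are not all equal
    z_scores = {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}
    positives = sorted(pair for pair, z in z_scores.items() if z > z_threshold)
    return z_scores, positives
```

In practice the reference distribution matters: a z-score computed against a background of non-interacting pairs is more conservative than one computed against all pairs, so the choice should follow the experimental design.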


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres (dots): surface. Green spheres: surface residues on the proteasome.
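
The distance-weighted shortest path of panels (C) and (D) can be sketched with Dijkstra's algorithm. This is a minimal illustration, not the procedure actually used: it assumes that surface residues and the two C-termini are given as 3D coordinates, that two points are connected when they lie within a cutoff distance, and that each edge is weighted by the Euclidean distance. The function name `shortest_surface_path` and the `cutoff` parameter are hypothetical.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, cutoff):
    """Distance-weighted shortest path between two labelled points.

    coords: dict mapping a label to an (x, y, z) coordinate.
    start, end: labels of the two C-terminal residues.
    cutoff: maximum distance for two points to be connected (assumption).
    Returns (path_length, path), or (math.inf, []) if end is unreachable.
    """
    best = {start: 0.0}
    previous = {}
    visited = set()
    heap = [(0.0, start)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == end:
            break
        for other, xyz in coords.items():
            if other in visited:
                continue
            step = math.dist(coords[node], xyz)
            if step > cutoff:
                continue  # too far apart to be neighbours on the surface
            candidate = dist + step
            if candidate < best.get(other, math.inf):
                best[other] = candidate
                previous[other] = node
                heapq.heappush(heap, (candidate, other))
    if end not in best:
        return math.inf, []
    path = [end]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return best[end], path
```

Restricting the path to surface points keeps it from cutting through the protein body, so the resulting length is at least as long as the straight-line distance between the two C-termini.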


General conclusion

The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid method is one that detects associations between proteins that lie close to each other in space without necessarily interacting physically. Such a method would make it possible to dissect the architecture of protein complexes in greater depth. In practice, the approach consisted of modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and allowed better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, more than 30% of all unique interactions detected were new. A similar proportion of new interactions could therefore be expected if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths that are used. For example, it could be lower if only one combination were used, because some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases.

The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. The use of extended linkers also provides information on the organization of protein complexes, particularly when central proteins are involved. Finally, the associations detected reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to the DHFR PCA is a useful advance in the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside linkers of additional lengths. This would provide a validation and a point of comparison, and would help detect problems that occurred during the construction of the fusion proteins, such as incorrect recombination or the appearance of mutations. Interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. In addition, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close together, which reduces false positives. The DHFR PCA can be performed on solid or in liquid medium, so it is easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

Only two linker lengths were tested in this project. It would be interesting to establish a range of linker lengths that provides several resolutions of the PPI network. The first step would be to determine the maximal length that still detects plausible protein-protein associations while limiting false positives. It would also be necessary to determine the optimal increment, so as to maximize new information while taking into account the complexity added with each additional linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution made possible by extending the linker would provide a general picture of their architecture. At the level of an organism's proteome, this method would give a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.

Page 44: Mesurer les associations protéiques à proximité in …...Mesurer les associations protéiques à proximité in vivo en utilisant la complémentation de fragments protéiques Mémoire


Table S2A. Distances between C-termini calculated from molecular modeling

Table S2A is too lengthy to be included in this document but can be obtained upon request.


Table S2B. Identity between each RNApol structure and the experimental sequences

Reference        Yeast protein  Complex     Identity (%)
4C2M chain 1     Rpc10          RNApol I    100
4C2M chain 2     Rpa34          RNApol I    92.4
4C2M chain 3     Rpa49          RNApol I    94.4
4C2M chain 4     Rpa43          RNApol I    100
4C2M chain 5     Rpa190         RNApol I    89.7
4C2M chain 6     Rpc40          RNApol I    100
4C2M chain 7     Rpa135         RNApol I    97.2
4C2M chain 8     Rpb5           RNApol I    100
4C2M chain 9     Rpa14          RNApol I    59.6
4C2M chain 10    Rpa43          RNApol I    81.4
4C2M chain 11    Rpo26          RNApol I    100
4C2M chain 12    Rpa12          RNApol I    100
4C2M chain 13    Rpb8           RNApol I    88.2
4C2M chain 14    Rpc19          RNApol I    100
4C2M chain 15    Rpb10          RNApol I    100
4C2M chain 16    Rpa49          RNApol I    100
4C2M chain 17    Rpc10          RNApol I    100
4C2M chain 18    Rpa43          RNApol I    100
4C2M chain 19    Rpa34          RNApol I    92.4
4C2M chain 20    Rpa135         RNApol I    96.2
4C2M chain 21    Rpa190         RNApol I    88.5
4C2M chain 22    Rpa14          RNApol I    55.1
4C2M chain 23    Rpc40          RNApol I    100
4C2M chain 24    Rpo26          RNApol I    100
4C2M chain 25    Rpb5           RNApol I    100
4C2M chain 26    Rpb8           RNApol I    88.2
4C2M chain 27    Rpa43          RNApol I    80.2
4C2M chain 28    Rpb10          RNApol I    100
4C2M chain 29    Rpa12          RNApol I    96
4C2M chain 30    Rpc19          RNApol I    100
4C3I chain A     Rpa190         RNApol I    89.2
4C3I chain C     Rpc40          RNApol I    99.3
4C3I chain B     Rpa135         RNApol I    98.2
4C3I chain E     Rpb5           RNApol I    100
4C3I chain D     Rpa14          RNApol I    55.1
4C3I chain G     Rpa43          RNApol I    78.3
4C3I chain F     Rpo26          RNApol I    100
4C3I chain I     Rpa12          RNApol I    100
4C3I chain H     Rpb8           RNApol I    84.7
4C3I chain K     Rpc19          RNApol I    100
4C3I chain J     Rpb10          RNApol I    100
4C3I chain M     Rpa49          RNApol I    97.2
4C3I chain L     Rpc10          RNApol I    100
4C3I chain N     Rpa34          RNApol I    88
4V1N chain A     Rpo21          RNApol II   97.9
4V1N chain C     Rpb3           RNApol II   100
4V1N chain B     Rpb2           RNApol II   93.6
4V1N chain E     Rpb5           RNApol II   100
4V1N chain D     Rpb4           RNApol II   80.8
4V1N chain G     Rpb7           RNApol II   100
4V1N chain F     Rpo26          RNApol II   100
4V1N chain I     Rpb9           RNApol II   100
4V1N chain H     Rpb8           RNApol II   91
4V1N chain K     Rpb11          RNApol II   100
4V1N chain J     Rpb10          RNApol II   100
4V1N chain L     Rpc10          RNApol II   100
4V1N chain R     Tfg2           RNApol II   60.3
5FJA chain A     Rpo31          RNApol III  96.2
5FJA chain C     Rpc40          RNApol III  100
5FJA chain B     Ret1           RNApol III  100
5FJA chain E     Rpb5           RNApol III  100
5FJA chain D     Rpc17          RNApol III  73.9
5FJA chain G     Rpc25          RNApol III  85.8
5FJA chain F     Rpo26          RNApol III  100
5FJA chain I     Rpc11          RNApol III  82.7
5FJA chain H     Rpb8           RNApol III  94.5
5FJA chain K     Rpc19          RNApol III  100
5FJA chain J     Rpb10          RNApol III  100
5FJA chain M     Rpc37          RNApol III  84.9
5FJA chain L     Rpc10          RNApol III  100
5FJA chain O     Rpc82          RNApol III  84.3
5FJA chain N     Rpc53          RNApol III  73.8
5FJA chain Q     Rpc31          RNApol III  100
5FJA chain P     Rpc34          RNApol III  57.2


Table S2C. Identity between the proteasome structures and the experimental sequences

Reference                        Yeast protein  Complex     Identity (%)
5CZ4-centered chain A            Pre8           Proteasome  100
5CZ4-centered chain AA           Pre4           Proteasome  100
5CZ4-centered chain B            Pre9           Proteasome  100
5CZ4-centered chain BA           Pre3           Proteasome  100
5CZ4-centered chain C            Pre6           Proteasome  100
5CZ4-centered chain D            Pup2           Proteasome  97.1
5CZ4-centered chain E            Pre5           Proteasome  100
5CZ4-centered chain F            Pre10          Proteasome  100
5CZ4-centered chain G            Scl1           Proteasome  100
5CZ4-centered chain H            Pup1           Proteasome  100
5CZ4-centered chain I            Pup3           Proteasome  100
5CZ4-centered chain J            Pre1           Proteasome  100
5CZ4-centered chain K            Pre2           Proteasome  100
5CZ4-centered chain L            Pre7           Proteasome  100
5CZ4-centered chain M            Pre4           Proteasome  100
5CZ4-centered chain N            Pre3           Proteasome  100
5CZ4-centered chain O            Pre8           Proteasome  100
5CZ4-centered chain P            Pre9           Proteasome  100
5CZ4-centered chain Q            Pre6           Proteasome  100
5CZ4-centered chain R            Pup2           Proteasome  97.1
5CZ4-centered chain S            Pre5           Proteasome  100
5CZ4-centered chain T            Pre10          Proteasome  100
5CZ4-centered chain U            Scl1           Proteasome  100
5CZ4-centered chain V            Pup1           Proteasome  100
5CZ4-centered chain W            Pup3           Proteasome  100
5CZ4-centered chain X            Pre1           Proteasome  100
5CZ4-centered chain Y            Pre2           Proteasome  100
5CZ4-centered chain Z            Pre7           Proteasome  100
5A5B-centered chain A            Pre3           Proteasome  100
5A5B-centered chain AA           Rpn7           Proteasome  100
5A5B-centered chain B            Pup1           Proteasome  100
5A5B-centered chain BA           Rpn3           Proteasome  100
5A5B-centered chain C            Pup3           Proteasome  100
5A5B-centered chain CA           Rpn12          Proteasome  100
5A5B-centered chain D            Pre1           Proteasome  100
5A5B-centered chain DA           Rpn8           Proteasome  82.9
5A5B-centered chain E            Pre2           Proteasome  99.5
5A5B-centered chain EA           Rpn11          Proteasome  89.5
5A5B-centered chain F            Pre7           Proteasome  100
5A5B-centered chain FA           Rpn10          Proteasome  100
5A5B-centered chain G            Pre4           Proteasome  100
5A5B-centered chain GA           Rpn13          Proteasome  100
5A5B-centered chain HA           Sem1           Proteasome  100
5A5B-centered chain IA           Rpn1           Proteasome  85.9
5A5B-centered chain J            Scl1           Proteasome  100
5A5B-centered chain K            Pre8           Proteasome  100
5A5B-centered chain L            Pre9           Proteasome  100
5A5B-centered chain M            Pre6           Proteasome  100
5A5B-centered chain N            Pup2           Proteasome  100
5A5B-centered chain O            Pre5           Proteasome  100
5A5B-centered chain P            Pre10          Proteasome  100
5A5B-centered chain Q            Rpt1           Proteasome  88
5A5B-centered chain R            Rpt2           Proteasome  100
5A5B-centered chain S            Rpt6           Proteasome  100
5A5B-centered chain T            Rpt3           Proteasome  100
5A5B-centered chain U            Rpt4           Proteasome  100
5A5B-centered chain V            Rpt5           Proteasome  93.1
5A5B-centered chain W            Rpn2           Proteasome  90.9
5A5B-centered chain X            Rpn9           Proteasome  100
5A5B-centered chain Y            Rpn5           Proteasome  100
5A5B-centered chain Z            Rpn6           Proteasome  100
Constructed proteasome chain 1   Pup1           Proteasome  100
Constructed proteasome chain 10  Pre8           Proteasome  100
Constructed proteasome chain 11  Pre9           Proteasome  100
Constructed proteasome chain 12  Pre6           Proteasome  100
Constructed proteasome chain 13  Pup2           Proteasome  100
Constructed proteasome chain 14  Pre5           Proteasome  100
Constructed proteasome chain 15  Pre10          Proteasome  100
Constructed proteasome chain 16  Rpt1           Proteasome  88
Constructed proteasome chain 17  Rpt2           Proteasome  100
Constructed proteasome chain 18  Rpt6           Proteasome  100
Constructed proteasome chain 19  Rpt3           Proteasome  100
Constructed proteasome chain 2   Pup3           Proteasome  100
Constructed proteasome chain 20  Rpt4           Proteasome  100
Constructed proteasome chain 21  Rpt5           Proteasome  93.1
Constructed proteasome chain 22  Rpn2           Proteasome  90.9
Constructed proteasome chain 23  Rpn9           Proteasome  100
Constructed proteasome chain 24  Rpn5           Proteasome  100
Constructed proteasome chain 25  Rpn6           Proteasome  100
Constructed proteasome chain 26  Rpn7           Proteasome  100
Constructed proteasome chain 27  Rpn3           Proteasome  100
Constructed proteasome chain 28  Rpn12          Proteasome  100
Constructed proteasome chain 29  Rpn8           Proteasome  82.9
Constructed proteasome chain 3   Pre1           Proteasome  100
Constructed proteasome chain 30  Rpn11          Proteasome  89.5
Constructed proteasome chain 31  Rpn10          Proteasome  100
Constructed proteasome chain 32  Rpn13          Proteasome  100
Constructed proteasome chain 33  Sem1           Proteasome  100
Constructed proteasome chain 34  Rpn1           Proteasome  85.9
Constructed proteasome chain 35  Pup1           Proteasome  100
Constructed proteasome chain 36  Pup3           Proteasome  100
Constructed proteasome chain 37  Pre1           Proteasome  100
Constructed proteasome chain 38  Pre2           Proteasome  100
Constructed proteasome chain 39  Pre7           Proteasome  100
Constructed proteasome chain 4   Pre2           Proteasome  100
Constructed proteasome chain 40  Pre4           Proteasome  100
Constructed proteasome chain 41  Pre3           Proteasome  100
Constructed proteasome chain 42  Pre4           Proteasome  100
Constructed proteasome chain 45  Scl1           Proteasome  100
Constructed proteasome chain 46  Pre8           Proteasome  100
Constructed proteasome chain 47  Pre9           Proteasome  100
Constructed proteasome chain 48  Pre6           Proteasome  100
Constructed proteasome chain 49  Pup2           Proteasome  100
Constructed proteasome chain 5   Pre7           Proteasome  100
Constructed proteasome chain 50  Pre5           Proteasome  100
Constructed proteasome chain 51  Pre10          Proteasome  100
Constructed proteasome chain 52  Rpt1           Proteasome  88
Constructed proteasome chain 53  Rpt2           Proteasome  100
Constructed proteasome chain 54  Rpt6           Proteasome  100
Constructed proteasome chain 55  Rpt3           Proteasome  100
Constructed proteasome chain 56  Rpt4           Proteasome  100
Constructed proteasome chain 57  Rpt5           Proteasome  93.1
Constructed proteasome chain 58  Rpn2           Proteasome  90.9
Constructed proteasome chain 59  Rpn9           Proteasome  100
Constructed proteasome chain 6   Pre3           Proteasome  100
Constructed proteasome chain 60  Rpn5           Proteasome  100
Constructed proteasome chain 61  Rpn6           Proteasome  100
Constructed proteasome chain 62  Rpn7           Proteasome  100
Constructed proteasome chain 63  Rpn3           Proteasome  100
Constructed proteasome chain 64  Rpn12          Proteasome  100
Constructed proteasome chain 65  Rpn8           Proteasome  82.9
Constructed proteasome chain 66  Rpn11          Proteasome  89.5
Constructed proteasome chain 67  Rpn10          Proteasome  100
Constructed proteasome chain 68  Rpn13          Proteasome  100
Constructed proteasome chain 69  Sem1           Proteasome  100
Constructed proteasome chain 70  Rpn1           Proteasome  85.9
Constructed proteasome chain 9   Scl1           Proteasome  100


Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I, II and III and proteasome structures

Yeast protein  Complex     Reference       Missing C-terminal residues
Rpa190         RNApol I    4C2M monomer 1  0
Rpa14          RNApol I    4C2M monomer 1  37
Rpa12          RNApol I    4C2M monomer 1  0
Rpb5           RNApol I    4C2M monomer 1  0
Rpb10          RNApol I    4C2M monomer 1  1
Rpa49          RNApol I    4C2M monomer 1  300
Rpc19          RNApol I    4C2M monomer 1  0
Rpb8           RNApol I    4C2M monomer 1  0
Rpa34          RNApol I    4C2M monomer 1  52
Rpa43          RNApol I    4C2M monomer 1  10
Rpc40          RNApol I    4C2M monomer 1  0
Rpc10          RNApol I    4C2M monomer 1  0
Rpa135         RNApol I    4C2M monomer 1  0
Rpo26          RNApol I    4C2M monomer 1  1
Rpa190         RNApol I    4C2M monomer 2  0
Rpa14          RNApol I    4C2M monomer 2  37
Rpa12          RNApol I    4C2M monomer 2  0
Rpb5           RNApol I    4C2M monomer 2  0
Rpb10          RNApol I    4C2M monomer 2  1
Rpa49          RNApol I    4C2M monomer 2  300
Rpc19          RNApol I    4C2M monomer 2  0
Rpb8           RNApol I    4C2M monomer 2  0
Rpa34          RNApol I    4C2M monomer 2  53
Rpa43          RNApol I    4C2M monomer 2  76
Rpc40          RNApol I    4C2M monomer 2  0
Rpc10          RNApol I    4C2M monomer 2  0
Rpa135         RNApol I    4C2M monomer 2  0
Rpo26          RNApol I    4C2M monomer 2  1
Rpa190         RNApol I    4C3I            1
Rpa14          RNApol I    4C3I            37
Rpb5           RNApol I    4C3I            0
Rpb10          RNApol I    4C3I            1
Rpa49          RNApol I    4C3I            301
Rpc19          RNApol I    4C3I            0
Rpb8           RNApol I    4C3I            0
Rpa34          RNApol I    4C3I            53
Rpa12          RNApol I    4C3I            0
Rpa43          RNApol I    4C3I            10
Rpc40          RNApol I    4C3I            0
Rpc10          RNApol I    4C3I            0
Rpa135         RNApol I    4C3I            0
Rpo26          RNApol I    4C3I            1
Rpb3           RNApol II   4V1N            50
Rpb11          RNApol II   4V1N            6
Rpb5           RNApol II   4V1N            0
Rpb7           RNApol II   4V1N            0
Rpb10          RNApol II   4V1N            5
Rpo26          RNApol II   4V1N            0
Rpb8           RNApol II   4V1N            0
Rpb4           RNApol II   4V1N            0
Rpb9           RNApol II   4V1N            2
Tfg2           RNApol II   4V1N            173
Rpb2           RNApol II   4V1N            0
Rpc10          RNApol II   4V1N            0
Rpo21          RNApol II   4V1N            278
Rpc11          RNApol III  5FJA            0
Rpc19          RNApol III  5FJA            0
Ret1           RNApol III  5FJA            0
Rpb5           RNApol III  5FJA            0
Rpb10          RNApol III  5FJA            3
Rpc37          RNApol III  5FJA            20
Rpc82          RNApol III  5FJA            0
Rpc31          RNApol III  5FJA            182
Rpb8           RNApol III  5FJA            0
Rpc53          RNApol III  5FJA            0
Rpc25          RNApol III  5FJA            0
Rpc34          RNApol III  5FJA            2
Rpo31          RNApol III  5FJA            0
Rpc40          RNApol III  5FJA            0
Rpc10          RNApol III  5FJA            0
Rpc17          RNApol III  5FJA            0
Rpo26          RNApol III  5FJA            2
Rpn6           Proteasome  5CZ4 and 5A5B   3
Rpn5           Proteasome  5CZ4 and 5A5B   3
Rpn3           Proteasome  5CZ4 and 5A5B   45
Rpn2           Proteasome  5CZ4 and 5A5B   20
Rpn1           Proteasome  5CZ4 and 5A5B   0
Rpn9           Proteasome  5CZ4 and 5A5B   6
Rpn8           Proteasome  5CZ4 and 5A5B   30
Pre10          Proteasome  5CZ4 and 5A5B   39
Pre6           Proteasome  5CZ4 and 5A5B   10
Pre7           Proteasome  5CZ4 and 5A5B   0
Rpt3           Proteasome  5CZ4 and 5A5B   0
Rpt2           Proteasome  5CZ4 and 5A5B   1
Pre2           Proteasome  5CZ4 and 5A5B   0
Rpt4           Proteasome  5CZ4 and 5A5B   10
Pre1           Proteasome  5CZ4 and 5A5B   3
Pre8           Proteasome  5CZ4 and 5A5B   0
Pre9           Proteasome  5CZ4 and 5A5B   12
Pup2           Proteasome  5CZ4 and 5A5B   9
Pup3           Proteasome  5CZ4 and 5A5B   0
Pup1           Proteasome  5CZ4 and 5A5B   6
Rpn13          Proteasome  5CZ4 and 5A5B   23
Rpn12          Proteasome  5CZ4 and 5A5B   2
Rpn11          Proteasome  5CZ4 and 5A5B   8
Rpn10          Proteasome  5CZ4 and 5A5B   71
Sem1           Proteasome  5CZ4 and 5A5B   0
Scl1           Proteasome  5CZ4 and 5A5B   0
Rpt1           Proteasome  5CZ4 and 5A5B   11
Pre4           Proteasome  5CZ4 and 5A5B   4
Pre5           Proteasome  5CZ4 and 5A5B   0
Rpt5           Proteasome  5CZ4 and 5A5B   0
Pre3           Proteasome  5CZ4 and 5A5B   0
Rpt6           Proteasome  5CZ4 and 5A5B   9
Rpn7           Proteasome  5CZ4 and 5A5B   7


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right owing to the ability of the 4xL linker to detect interactions further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in standard PCA (left).
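
The scoring described in panels (B) and (C) can be illustrated with a minimal Python sketch. It is not the pipeline used for the thesis: the function names, the way the z-score is standardized (against the mean and sample standard deviation of the whole plate) and the example colony sizes are all assumptions made for illustration, and the cutoff is left as a parameter whose default of 2.5 is our reading of the threshold in the legend.

```python
from math import sqrt
from statistics import mean, stdev


def z_scores(values):
    """Standardize values against their own mean and sample standard deviation."""
    mu = mean(values)
    sigma = stdev(values)
    return [(v - mu) / sigma for v in values]


def positive_ppis(colony_sizes, cutoff=2.5):
    """Return the indices of colonies whose z-score exceeds the cutoff."""
    return [i for i, z in enumerate(z_scores(colony_sizes)) if z > cutoff]


def pearson_r(xs, ys):
    """Pearson correlation between paired signals, e.g. A-B versus B-A."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical colony sizes: twenty background colonies and one strong signal.
sizes = [10.0] * 20 + [100.0]
hits = positive_ppis(sizes)
```

With these made-up values only the last colony passes the cutoff. `pearson_r` would be applied to the paired signals of reciprocal bait-prey combinations to obtain a coefficient comparable to those quoted in panel (C).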


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
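
The distance-weighted shortest path in panels (C) and (D) can be sketched as a Dijkstra search over surface points. This is only one plausible reading of the legend, not the code behind the figure: the graph construction (an edge between any two surface points closer than `cutoff`, weighted by their Euclidean distance), the `cutoff` parameter and the toy coordinates are assumptions introduced here.

```python
import heapq
import math


def shortest_surface_path(points, start, goal, cutoff):
    """Length of the shortest path from points[start] to points[goal].

    Two points are connected when their Euclidean distance is at most
    `cutoff`; each edge is weighted by that distance (Dijkstra's algorithm).
    Returns math.inf when the goal cannot be reached.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == goal:
            return d
        for v in range(len(points)):
            if v == u or v in visited:
                continue
            w = math.dist(points[u], points[v])
            if w <= cutoff and d + w < best.get(v, math.inf):
                best[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return math.inf


# Hypothetical surface points (in angstroms); index 0 and 2 stand in for
# two C-terminal residues, index 1 for an intermediate surface residue.
surface = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
around = shortest_surface_path(surface, 0, 2, cutoff=4.5)
direct = shortest_surface_path(surface, 0, 2, cutoff=5.0)
```

With the smaller cutoff the direct 5 Å hop is not allowed, so the path goes through the intermediate point. A path constrained to surface residues in this way follows the outside of the complex instead of cutting through it, which is presumably why it was preferred to a straight-line distance.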


General conclusion

The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid method is one that detects associations between proteins that are close to each other in space without necessarily interacting physically. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to check whether lengthening the linker changed the set of interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and therefore improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The detailed study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Among all the unique interactions detected, more than 30% were new. A similar proportion of new interactions could therefore be expected if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases. The rare cases in which the signal decreased as the linker was lengthened are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when central proteins are involved. Finally, the associations detected reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that lengthening the linker does allow the detection of associations between proteins that are further apart in space.

The modification made to the DHFR PCA represents a notable advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths. This would provide a validation and a point of comparison, and would help detect problems that may have arisen during construction of the fusion proteins. For example, it becomes easier to spot a faulty recombination event or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that gives several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. It would also be necessary to determine the optimal increment, so as to maximize the new information gained while taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies (P-bodies) and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliographie

1 Vidal M Cusick ME Barabasi AL Interactome networks and human disease Cell 2011144(6)986-98 2 Taylor SS Ilouz R Zhang P Kornev AP Assembly of allosteric macromolecular switches lessons from PKA Nature reviews Molecular cell biology 201213(10)646-58 3 Vandamme J Castermans D Thevelein JM Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation Cellular signalling 201224(8)1610-8 4 Conrad M Schothorst J Kankipati HN Van Zeebroeck G Rubio-Texeira M Thevelein JM Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae FEMS microbiology reviews 201438(2)254-99 5 Broach JR RAS genes in Saccharomyces cerevisiae signal transduction in search of a pathway Trends in genetics TIG 19917(1)28-33 6 Fontana L Partridge L Longo VD Extending healthy life span--from yeast to humans Science 2010328(5976)321-6 7 Wong W Scott JD AKAP signalling complexes focal points in space and time Nature reviews Molecular cell biology 20045(12)959-70 8 Beuschlein F Fassnacht M Assie G Calebiro D Stratakis CA Osswald A et al Constitutive activation of PKA catalytic subunit in adrenal Cushings syndrome N Engl J Med 2014370(11)1019-28 9 Bult CJ Drabkin HJ Evsikov A Natale D Arighi C Roberts N et al The representation of protein complexes in the Protein Ontology (PRO) BMC Bioinformatics 201112371 10 Peters JM Cejka Z Harris JR Kleinschmidt JA Baumeister W Structural features of the 26 S proteasome complex J Mol Biol 1993234(4)932-7 11 Voges D Zwickl P Baumeister W The 26S proteasome a molecular machine designed for controlled proteolysis Annual review of biochemistry 1999681015-68 12 Tanaka K The proteasome overview of structure and functions Proceedings of the Japan Academy Series B Physical and biological sciences 200985(1)12-36 13 Wehmer M Sakata E Recent advances in the structural biology of the 26S proteasome Int J Biochem Cell Biol 201679437-42 14 Gomes AV Genetics of proteasome diseases Scientifica 20132013637629 15 Miller Z Ao L Kim KB Lee 
W Inhibitors of the immunoproteasome current status and future directions Current pharmaceutical design 201319(22)4140-51 16 Kaur G Batra S Emerging role of immunoproteasomes in pathophysiology Immunology and cell biology 201694(9)812-20 17 Rual J-F Venkatesan K Hao T Hirozane-Kishikawa T Dricot A Li N et al Towards a proteome-scale map of the human protein-protein interaction network Nature 2005437(7062)1173-8 18 Krogan NJ Cagney G Yu H Zhong G Guo X Ignatchenko A et al Global landscape of protein complexes in the yeast Saccharomyces cerevisiae Nature 2006440(7084)637-43 19 Collins SR Kemmeren P Zhao XC Greenblatt JF Spencer F Holstege FC et al Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae Molecular amp cellular proteomics MCP 20076(3)439-50 20 Gavin AC Aloy P Grandi P Krause R Boesche M Marzioch M et al Proteome survey reveals modularity of the yeast cell machinery Nature 2006440(7084)631-6 21 Giot L Bader JS Brouwer C Chaudhuri A Kuang B Li Y et al A protein interaction map of Drosophila melanogaster Science 2003302(5651)1727-36

47

22 Li S Armstrong CM Bertin N Ge H Milstein S Boxem M et al A map of the interactome network of the metazoan C elegans Science 2004303(5657)540-3 23 Rajagopala SV Sikorski P Kumar A Mosca R Vlasblom J Arnold R et al The binary protein-protein interaction landscape of Escherichia coli Nat Biotech 201432(3)285-90 24 Parrish JR Yu J Liu G Hines JA Chan JE Mangiola BA et al A proteome-wide protein interaction map for Campylobacter jejuni Genome Biology 20078(7)1-19 25 Wang Y Cui T Zhang C Yang M Huang Y Li W et al Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv Journal of proteome research 20109(12)6665-77 26 Cherkasov A Hsing M Zoraghi R Foster LJ See RH Stoynov N et al Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus Journal of proteome research 201110(3)1139-50 27 Hagen N Bayer K Rosch K Schindler M The intraviral protein interaction network of hepatitis C virus Molecular amp cellular proteomics MCP 201413(7)1676-89 28 Fossum E Friedel CC Rajagopala SV Titz B Baiker A Schmidt T et al Evolutionarily conserved herpesviral protein interaction networks PLoS pathogens 20095(9)e1000570 29 Stellberger T Hauser R Baiker A Pothineni VR Haas J Uetz P Improving the yeast two-hybrid system with permutated fusions proteins the Varicella Zoster Virus interactome Proteome science 201088 30 Obado SO Brillantes M Uryu K Zhang W Ketaren NE Chait BT et al Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex PLoS biology 201614(2)e1002365 31 Diss G Dube AK Boutin J Gagnon-Arsenault I Landry CR A systematic approach for the genetic dissection of protein complexes in living cells Cell Rep 20133(6)2155-67 32 Ferreira LG Oliva G Andricopulo AD Protein-protein interaction inhibitors advances in anticancer drug design Expert opinion on drug discovery 2016 33 Hamdi A Colas P Yeast two-hybrid methods and their applications in drug discovery Trends in pharmacological sciences 
201233(2)109-18 34 Zoraghi R Reiner NE Protein interaction networks as starting points to identify novel antimicrobial drug targets Current opinion in microbiology 201316(5)566-72 35 Khare S Nagle AS Biggart A Lai YH Liang F Davis LC et al Proteasome inhibition for treatment of leishmaniasis Chagas disease and sleeping sickness Nature 2016 36 Sahni N Yi S Taipale M Fuxman Bass JI Coulombe-Huntington J Yang F et al Widespread macromolecular interaction perturbations in human genetic disorders Cell 2015161(3)647-60 37 Jensen LJ Bork P Biochemistry Not comparable but complementary Science 2008322(5898)56-7 38 Syafrizayanti Betzen C Hoheisel JD Kastelic D Methods for analyzing and quantifying protein-protein interaction Expert review of proteomics 201411(1)107-20 39 Marcilla M Albar JP Quantitative proteomics A strategic ally to map protein interaction networks IUBMB life 201365(1)9-16 40 Woods AG Sokolowska I Ngounou Wetie AG Wormwood K Aslebagh R Patel S et al Mass spectrometry for proteomics-based investigation Advances in experimental medicine and biology 20148061-32 41 Chen GI Gingras AC Affinity-purification mass spectrometry (AP-MS) of serinethreonine phosphatases Methods 200742(3)298-305 42 Dunham WH Mullin M Gingras AC Affinity-purification coupled to mass spectrometry basic principles and strategies Proteomics 201212(10)1576-90

48

43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.



Table S2B. Identity between each RNApol structure and the experimental sequences

Reference   Yeast proteins   Complex   Identity (%)

4C2M chain 1   Rpc10   RNApol I   100
4C2M chain 2   Rpa34   RNApol I   92.4
4C2M chain 3   Rpa49   RNApol I   94.4
4C2M chain 4   Rpa43   RNApol I   100
4C2M chain 5   Rpa190   RNApol I   89.7
4C2M chain 6   Rpc40   RNApol I   100
4C2M chain 7   Rpa135   RNApol I   97.2
4C2M chain 8   Rpb5   RNApol I   100
4C2M chain 9   Rpa14   RNApol I   59.6
4C2M chain 10   Rpa43   RNApol I   81.4
4C2M chain 11   Rpo26   RNApol I   100
4C2M chain 12   Rpa12   RNApol I   100
4C2M chain 13   Rpb8   RNApol I   88.2
4C2M chain 14   Rpc19   RNApol I   100
4C2M chain 15   Rpb10   RNApol I   100
4C2M chain 16   Rpa49   RNApol I   100
4C2M chain 17   Rpc10   RNApol I   100
4C2M chain 18   Rpa43   RNApol I   100
4C2M chain 19   Rpa34   RNApol I   92.4
4C2M chain 20   Rpa135   RNApol I   96.2
4C2M chain 21   Rpa190   RNApol I   88.5
4C2M chain 22   Rpa14   RNApol I   55.1
4C2M chain 23   Rpc40   RNApol I   100
4C2M chain 24   Rpo26   RNApol I   100
4C2M chain 25   Rpb5   RNApol I   100
4C2M chain 26   Rpb8   RNApol I   88.2
4C2M chain 27   Rpa43   RNApol I   80.2
4C2M chain 28   Rpb10   RNApol I   100
4C2M chain 29   Rpa12   RNApol I   96
4C2M chain 30   Rpc19   RNApol I   100
4C3I chain A   Rpa190   RNApol I   89.2
4C3I chain C   Rpc40   RNApol I   99.3
4C3I chain B   Rpa135   RNApol I   98.2
4C3I chain E   Rpb5   RNApol I   100
4C3I chain D   Rpa14   RNApol I   55.1
4C3I chain G   Rpa43   RNApol I   78.3
4C3I chain F   Rpo26   RNApol I   100
4C3I chain I   Rpa12   RNApol I   100
4C3I chain H   Rpb8   RNApol I   84.7
4C3I chain K   Rpc19   RNApol I   100
4C3I chain J   Rpb10   RNApol I   100
4C3I chain M   Rpa49   RNApol I   97.2
4C3I chain L   Rpc10   RNApol I   100
4C3I chain N   Rpa34   RNApol I   88
4V1N chain A   Rpo21   RNApol II   97.9
4V1N chain C   Rpb3   RNApol II   100
4V1N chain B   Rpb2   RNApol II   93.6
4V1N chain E   Rpb5   RNApol II   100
4V1N chain D   Rpb4   RNApol II   80.8
4V1N chain G   Rpb7   RNApol II   100
4V1N chain F   Rpo26   RNApol II   100
4V1N chain I   Rpb9   RNApol II   100
4V1N chain H   Rpb8   RNApol II   91
4V1N chain K   Rpb11   RNApol II   100
4V1N chain J   Rpb10   RNApol II   100
4V1N chain L   Rpc10   RNApol II   100
4V1N chain R   Tfg2   RNApol II   60.3
5FJA chain A   Rpo31   RNApol III   96.2
5FJA chain C   Rpc40   RNApol III   100
5FJA chain B   Ret1   RNApol III   100
5FJA chain E   Rpb5   RNApol III   100
5FJA chain D   Rpc17   RNApol III   73.9
5FJA chain G   Rpc25   RNApol III   85.8
5FJA chain F   Rpo26   RNApol III   100
5FJA chain I   Rpc11   RNApol III   82.7
5FJA chain H   Rpb8   RNApol III   94.5
5FJA chain K   Rpc19   RNApol III   100
5FJA chain J   Rpb10   RNApol III   100
5FJA chain M   Rpc37   RNApol III   84.9
5FJA chain L   Rpc10   RNApol III   100
5FJA chain O   Rpc82   RNApol III   84.3
5FJA chain N   Rpc53   RNApol III   73.8
5FJA chain Q   Rpc31   RNApol III   100
5FJA chain P   Rpc34   RNApol III   57.2
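
The identity values in Table S2B compare the sequence of each structure chain with the experimental sequence. How the alignment was produced and which columns entered the denominator are not stated here, so the following is only a minimal sketch under assumptions: the two sequences are already aligned (gaps written as `-`), and identity is counted over columns where both sequences carry a residue. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns with a gap ('-') in either sequence are ignored,
    which is one common convention among several.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # compared columns where the residues are identical
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        return 0.0
    return 100.0 * matches / compared
```

Under this convention, a structure chain that lacks a disordered terminal segment can still reach 100% identity, because the unresolved residues simply do not enter the comparison.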


Table S2C. Identity between the proteasome structures and the experimental sequences

Reference   Yeast proteins   Complex   Identity (%)

5CZ4-centered chain A   Pre8   Proteasome   100
5CZ4-centered chain AA   Pre4   Proteasome   100
5CZ4-centered chain B   Pre9   Proteasome   100
5CZ4-centered chain BA   Pre3   Proteasome   100
5CZ4-centered chain C   Pre6   Proteasome   100
5CZ4-centered chain D   Pup2   Proteasome   97.1
5CZ4-centered chain E   Pre5   Proteasome   100
5CZ4-centered chain F   Pre10   Proteasome   100
5CZ4-centered chain G   Scl1   Proteasome   100
5CZ4-centered chain H   Pup1   Proteasome   100
5CZ4-centered chain I   Pup3   Proteasome   100
5CZ4-centered chain J   Pre1   Proteasome   100
5CZ4-centered chain K   Pre2   Proteasome   100
5CZ4-centered chain L   Pre7   Proteasome   100
5CZ4-centered chain M   Pre4   Proteasome   100
5CZ4-centered chain N   Pre3   Proteasome   100
5CZ4-centered chain O   Pre8   Proteasome   100
5CZ4-centered chain P   Pre9   Proteasome   100
5CZ4-centered chain Q   Pre6   Proteasome   100
5CZ4-centered chain R   Pup2   Proteasome   97.1
5CZ4-centered chain S   Pre5   Proteasome   100
5CZ4-centered chain T   Pre10   Proteasome   100
5CZ4-centered chain U   Scl1   Proteasome   100
5CZ4-centered chain V   Pup1   Proteasome   100
5CZ4-centered chain W   Pup3   Proteasome   100
5CZ4-centered chain X   Pre1   Proteasome   100
5CZ4-centered chain Y   Pre2   Proteasome   100
5CZ4-centered chain Z   Pre7   Proteasome   100
5A5B-centered chain A   Pre3   Proteasome   100
5A5B-centered chain AA   Rpn7   Proteasome   100
5A5B-centered chain B   Pup1   Proteasome   100
5A5B-centered chain BA   Rpn3   Proteasome   100
5A5B-centered chain C   Pup3   Proteasome   100
5A5B-centered chain CA   Rpn12   Proteasome   100
5A5B-centered chain D   Pre1   Proteasome   100
5A5B-centered chain DA   Rpn8   Proteasome   82.9
5A5B-centered chain E   Pre2   Proteasome   99.5
5A5B-centered chain EA   Rpn11   Proteasome   89.5
5A5B-centered chain F   Pre7   Proteasome   100
5A5B-centered chain FA   Rpn10   Proteasome   100
5A5B-centered chain G   Pre4   Proteasome   100
5A5B-centered chain GA   Rpn13   Proteasome   100
5A5B-centered chain HA   Sem1   Proteasome   100
5A5B-centered chain IA   Rpn1   Proteasome   85.9
5A5B-centered chain J   Scl1   Proteasome   100
5A5B-centered chain K   Pre8   Proteasome   100
5A5B-centered chain L   Pre9   Proteasome   100
5A5B-centered chain M   Pre6   Proteasome   100
5A5B-centered chain N   Pup2   Proteasome   100
5A5B-centered chain O   Pre5   Proteasome   100
5A5B-centered chain P   Pre10   Proteasome   100
5A5B-centered chain Q   Rpt1   Proteasome   88
5A5B-centered chain R   Rpt2   Proteasome   100
5A5B-centered chain S   Rpt6   Proteasome   100
5A5B-centered chain T   Rpt3   Proteasome   100
5A5B-centered chain U   Rpt4   Proteasome   100
5A5B-centered chain V   Rpt5   Proteasome   93.1
5A5B-centered chain W   Rpn2   Proteasome   90.9
5A5B-centered chain X   Rpn9   Proteasome   100
5A5B-centered chain Y   Rpn5   Proteasome   100
5A5B-centered chain Z   Rpn6   Proteasome   100
Constructed proteasome chain 1   Pup1   Proteasome   100
Constructed proteasome chain 10   Pre8   Proteasome   100
Constructed proteasome chain 11   Pre9   Proteasome   100
Constructed proteasome chain 12   Pre6   Proteasome   100
Constructed proteasome chain 13   Pup2   Proteasome   100
Constructed proteasome chain 14   Pre5   Proteasome   100
Constructed proteasome chain 15   Pre10   Proteasome   100
Constructed proteasome chain 16   Rpt1   Proteasome   88
Constructed proteasome chain 17   Rpt2   Proteasome   100
Constructed proteasome chain 18   Rpt6   Proteasome   100
Constructed proteasome chain 19   Rpt3   Proteasome   100
Constructed proteasome chain 2   Pup3   Proteasome   100
Constructed proteasome chain 20   Rpt4   Proteasome   100
Constructed proteasome chain 21   Rpt5   Proteasome   93.1
Constructed proteasome chain 22   Rpn2   Proteasome   90.9
Constructed proteasome chain 23   Rpn9   Proteasome   100
Constructed proteasome chain 24   Rpn5   Proteasome   100
Constructed proteasome chain 25   Rpn6   Proteasome   100
Constructed proteasome chain 26   Rpn7   Proteasome   100
Constructed proteasome chain 27   Rpn3   Proteasome   100
Constructed proteasome chain 28   Rpn12   Proteasome   100
Constructed proteasome chain 29   Rpn8   Proteasome   82.9
Constructed proteasome chain 3   Pre1   Proteasome   100
Constructed proteasome chain 30   Rpn11   Proteasome   89.5
Constructed proteasome chain 31   Rpn10   Proteasome   100
Constructed proteasome chain 32   Rpn13   Proteasome   100
Constructed proteasome chain 33   Sem1   Proteasome   100
Constructed proteasome chain 34   Rpn1   Proteasome   85.9
Constructed proteasome chain 35   Pup1   Proteasome   100
Constructed proteasome chain 36   Pup3   Proteasome   100
Constructed proteasome chain 37   Pre1   Proteasome   100
Constructed proteasome chain 38   Pre2   Proteasome   100
Constructed proteasome chain 39   Pre7   Proteasome   100
Constructed proteasome chain 4   Pre2   Proteasome   100
Constructed proteasome chain 40   Pre4   Proteasome   100
Constructed proteasome chain 41   Pre3   Proteasome   100
Constructed proteasome chain 42   Pre4   Proteasome   100
Constructed proteasome chain 45   Scl1   Proteasome   100
Constructed proteasome chain 46   Pre8   Proteasome   100
Constructed proteasome chain 47   Pre9   Proteasome   100
Constructed proteasome chain 48   Pre6   Proteasome   100
Constructed proteasome chain 49   Pup2   Proteasome   100
Constructed proteasome chain 5   Pre7   Proteasome   100
Constructed proteasome chain 50   Pre5   Proteasome   100
Constructed proteasome chain 51   Pre10   Proteasome   100
Constructed proteasome chain 52   Rpt1   Proteasome   88
Constructed proteasome chain 53   Rpt2   Proteasome   100
Constructed proteasome chain 54   Rpt6   Proteasome   100
Constructed proteasome chain 55   Rpt3   Proteasome   100
Constructed proteasome chain 56   Rpt4   Proteasome   100
Constructed proteasome chain 57   Rpt5   Proteasome   93.1
Constructed proteasome chain 58   Rpn2   Proteasome   90.9
Constructed proteasome chain 59   Rpn9   Proteasome   100
Constructed proteasome chain 6   Pre3   Proteasome   100
Constructed proteasome chain 60   Rpn5   Proteasome   100
Constructed proteasome chain 61   Rpn6   Proteasome   100
Constructed proteasome chain 62   Rpn7   Proteasome   100
Constructed proteasome chain 63   Rpn3   Proteasome   100
Constructed proteasome chain 64   Rpn12   Proteasome   100
Constructed proteasome chain 65   Rpn8   Proteasome   82.9
Constructed proteasome chain 66   Rpn11   Proteasome   89.5
Constructed proteasome chain 67   Rpn10   Proteasome   100
Constructed proteasome chain 68   Rpn13   Proteasome   100
Constructed proteasome chain 69   Sem1   Proteasome   100
Constructed proteasome chain 70   Rpn1   Proteasome   85.9
Constructed proteasome chain 9   Scl1   Proteasome   100


Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I, II and III and proteasome structures

Yeast proteins   Complex   Reference   Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths, according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right because the 4xL linker can detect interactions that are farther apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assays of selected results of the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
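
The thresholding in panel (B) and the reciprocal-pair correlation in panel (C) can be illustrated with a short sketch. It is not the analysis pipeline used for the figure: it assumes that z-scores are computed against the mean and standard deviation of all colony sizes in one experiment, and the function names and data layout are hypothetical. Only the z-score cutoff of 2.5 comes from the caption.

```python
from math import sqrt
from statistics import mean, pstdev


def z_scores(colony_sizes):
    """Standardize colony sizes; keys are (bait, prey) pairs (assumed layout)."""
    values = list(colony_sizes.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All colonies identical: no pair stands out from the background.
        return {pair: 0.0 for pair in colony_sizes}
    return {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}


def positive_ppis(colony_sizes, threshold=2.5):
    """Pairs whose z-score exceeds the threshold (2.5 in the caption)."""
    return {pair for pair, z in z_scores(colony_sizes).items() if z > threshold}


def pearson_r(xs, ys):
    """Pearson correlation, e.g. between A-B and B-A signals of reciprocal pairs."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

A real screen would more likely estimate the background from a reference set of non-interacting pairs than from all colonies, which would change the z-scores but not the logic of the cutoff.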


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
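
The distance-weighted shortest path in panels (C) and (D) can be sketched as a graph search over surface residues. The implementation used for the figure is not described here, so this is only an illustration under assumptions: surface residues are given as 3D coordinates, two residues are connected when they lie within a hypothetical `cutoff` distance, edges are weighted by Euclidean distance, and the path is found with Dijkstra's algorithm. All names are illustrative.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, cutoff):
    """Length of the distance-weighted shortest path from start to end.

    coords: dict mapping a residue label to its (x, y, z) coordinates.
    cutoff: maximum distance for two residues to be linked (assumed parameter).
    Returns math.inf when the two residues are not connected.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    visited = set()

    while heap:
        dist_so_far, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == end:
            return dist_so_far

        for other, xyz in coords.items():
            if other in visited:
                continue
            step = math.dist(coords[node], xyz)
            if step > cutoff:
                continue  # too far apart to be neighbours on the surface
            candidate = dist_so_far + step
            if candidate < best.get(other, math.inf):
                best[other] = candidate
                heapq.heappush(heap, (candidate, other))

    return math.inf
```

Restricting the nodes to surface residues forces the path to go around the complex rather than through it, which is a closer stand-in for the space a flexible linker can explore than a straight-line distance between C-termini.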


General conclusion

The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid method is one that detects associations between proteins that are close in space without those associations necessarily being physical interactions. Such a method would allow a deeper and finer dissection of the architecture of protein complexes. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the set of interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that adjusting a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thus improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths used. For example, it could be lower if only a single combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment seems to be sufficient to detect most of the new PPIs and most of those whose signal increases. The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nonetheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when central proteins are involved. Finally, the detected associations reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA represents a significant advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside the linkers of additional lengths; this would provide a validation and a point of comparison, and would help detect problems that may have occurred during construction of the fusion proteins. For example, incorrect recombination or the appearance of mutations becomes easier to spot: the correctly constructed protein would show interactions, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are brought very close together, which reduces false positives. The DHFR PCA can be performed on solid or in liquid medium, so it is easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features provide certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths providing several resolutions of the PPI network. One would first need to determine the maximal length that still detects plausible protein-protein associations while limiting false positives. One would also need to determine the optimal increment to maximize new information, taking into account the additional complexity that comes with each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, could be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by extending the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


4V1N chain C Rpb3 RNApol II 100

4V1N chain B Rpb2 RNApol II 93.6

4V1N chain E Rpb5 RNApol II 100

4V1N chain D Rpb4 RNApol II 80.8

4V1N chain G Rpb7 RNApol II 100

4V1N chain F Rpo26 RNApol II 100

4V1N chain I Rpb9 RNApol II 100

4V1N chain H Rpb8 RNApol II 91

4V1N chain K Rpb11 RNApol II 100

4V1N chain J Rpb10 RNApol II 100

4V1N chain L Rpc10 RNApol II 100

4V1N chain R Tfg2 RNApol II 60.3

5FJA chain A Rpo31 RNApol III 96.2

5FJA chain C Rpc40 RNApol III 100

5FJA chain B Ret1 RNApol III 100

5FJA chain E Rpb5 RNApol III 100

5FJA chain D Rpc17 RNApol III 73.9

5FJA chain G Rpc25 RNApol III 85.8

5FJA chain F Rpo26 RNApol III 100

5FJA chain I Rpc11 RNApol III 82.7

5FJA chain H Rpb8 RNApol III 94.5

5FJA chain K Rpc19 RNApol III 100

5FJA chain J Rpb10 RNApol III 100

5FJA chain M Rpc37 RNApol III 84.9

5FJA chain L Rpc10 RNApol III 100

5FJA chain O Rpc82 RNApol III 84.3

5FJA chain N Rpc53 RNApol III 73.8

5FJA chain Q Rpc31 RNApol III 100

5FJA chain P Rpc34 RNApol III 57.2


Table S2C. Identity between proteasome structure and the experimental sequence

Columns: Reference, Yeast proteins, Complex, Identity (%)

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 97.1

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 97.1

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 82.9

5A5B-centered chain E Pre2 Proteasome 99.5

5A5B-centered chain EA Rpn11 Proteasome 89.5

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 85.9

5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 93.1

5A5B-centered chain W Rpn2 Proteasome 90.9

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 93.1

Constructed proteasome chain 22 Rpn2 Proteasome 90.9

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 82.9

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 89.5

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 85.9

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100


Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100

Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I, II and III and proteasome structures

Columns: Yeast proteins, Complex, Reference, Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions further in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results of the intra-complex experiment. Examples for each category of changes are shown. Cell growth in spot-dilution assay (right) correlates with colony size in standard PCA (left).
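
The thresholding described in panel (B) can be sketched as follows. This is a minimal illustration, not the pipeline used for the figure: it assumes that the z-score is computed against the mean and standard deviation of all colony sizes in one experiment and that the cutoff is 2.5. The function name, the input layout and the protein pair labels are hypothetical.

```python
from statistics import mean, stdev

def call_positive_ppis(colony_sizes, z_threshold=2.5):
    """Return {pair: z-score} for pairs whose z-score exceeds the threshold.

    colony_sizes maps a (bait, prey) pair to its measured colony size.
    Assumption: z-scores are taken relative to all colonies in the input.
    """
    values = list(colony_sizes.values())
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation, needs >= 2 values
    if sigma == 0:
        return {}  # no spread, so nothing can stand out
    scores = {pair: (size - mu) / sigma for pair, size in colony_sizes.items()}
    return {pair: z for pair, z in scores.items() if z > z_threshold}

if __name__ == "__main__":
    # Hypothetical plate: 20 background colonies and one strong signal.
    plate = {("bait", f"prey{i}"): 100 + 2 * (i % 2) for i in range(20)}
    plate[("bait", "preyX")] = 200
    print(call_positive_ppis(plate))
```

A cutoff defined on z-scores rather than raw colony sizes lets the same rule be applied to experiments whose background growth differs.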


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
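
The distance-weighted shortest path of panel (C) can be illustrated with Dijkstra's algorithm on a graph of surface points. This is a sketch under stated assumptions, not the procedure used in the thesis: points are connected when they lie within a hypothetical neighbor cutoff, edges are weighted by Euclidean distance, and the coordinates and names below are invented for the example.

```python
import heapq
import math

def shortest_surface_path(coords, start, end, cutoff=10.0):
    """Length of the shortest path from start to end over a point graph.

    coords maps a point name to an (x, y, z) tuple. Two points are linked
    when their Euclidean distance is <= cutoff (an assumed parameter), and
    the edge weight is that distance. Returns math.inf if end is unreachable.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    visited = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in visited:
            continue
        visited.add(node)
        if node == end:
            return d
        for other, xyz in coords.items():
            if other in visited:
                continue
            step = math.dist(coords[node], xyz)
            if step <= cutoff and d + step < best.get(other, math.inf):
                best[other] = d + step
                heapq.heappush(heap, (d + step, other))
    return math.inf

if __name__ == "__main__":
    # Hypothetical points: two C-termini joined only through surface points.
    points = {
        "Cter_A": (0.0, 0.0, 0.0),
        "surf_1": (3.0, 4.0, 0.0),
        "surf_2": (6.0, 8.0, 0.0),
        "Cter_B": (6.0, 12.0, 0.0),
    }
    print(shortest_surface_path(points, "Cter_A", "Cter_B", cutoff=5.0))
```

Routing the path through surface points rather than measuring a straight line keeps the distance from cutting through the body of the complex, which a linker could not cross.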


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close to each other in space without necessarily being in physical interaction. Such a method would make it possible to explore and dissect the architecture of protein complexes in greater depth. Concretely, the approach consisted of modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to test whether increasing the linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and allowed associations to be identified more reliably. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is maintained despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths that are used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect the majority of the new PPIs and of those whose signal increases.

44

Les rares cas ougrave le signal diminuait avec lrsquoaugmentation de la longueur du connecteur

seraient davantage causeacutes par des effets steacuteriques plutocirct que par une deacutestabilisation des

proteacuteines impliqueacutees Cependant ces cas peuvent tout de mecircme fournir des informations

structurales notamment en identifiant les associations les plus fortes au sein du complexe

Par ailleurs lrsquoutilisation des connecteurs allongeacutes renseigne sur lrsquoorganisation des complexes

proteacuteiques particuliegraverement lorsqursquoelle implique les proteacuteines centrales Enfin les

associations deacutetecteacutees reflegravetent bien lrsquoorganisation des complexes proteacuteiques en sous-

complexes En comparant les distances entre les proteacuteines des structures du proteacuteasome et

les reacutesultats PCA obtenus il est possible de confirmer que lrsquoaugmentation de la longueur du

connecteur permet effectivement de deacutetecter des associations entre proteacuteines plus eacuteloigneacutees

dans lrsquoespace
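The comparison between structural distances and PCA results can be illustrated with a minimal sketch. It is not the analysis used in this work, which relied on distance-weighted shortest paths along the surface of the structure; the sketch substitutes a simple straight-line distance between C-termini. The span per residue (3.5 angstroms), the linker lengths, the coordinates and the function names are all assumptions chosen for illustration.

```python
import math

# Assumed geometry, not taken from the thesis: a fully extended, flexible
# linker spans roughly 3.5 angstroms per residue.
ANGSTROM_PER_RESIDUE = 3.5


def max_reach(linker_a_len, linker_b_len, per_residue=ANGSTROM_PER_RESIDUE):
    """Upper bound (angstroms) on the gap two fully extended linkers can bridge."""
    return (linker_a_len + linker_b_len) * per_residue


def within_reach(cterm_a, cterm_b, linker_a_len, linker_b_len):
    """True if the straight-line distance between C-termini is within reach."""
    return math.dist(cterm_a, cterm_b) <= max_reach(linker_a_len, linker_b_len)


# Hypothetical C-terminal coordinates (angstroms) for two subunits.
cterm_x = (0.0, 0.0, 0.0)
cterm_y = (60.0, 80.0, 0.0)  # 100 angstroms away from cterm_x

# Hypothetical linker lengths in residues: both short, then one doubled.
print(within_reach(cterm_x, cterm_y, 10, 10))
print(within_reach(cterm_x, cterm_y, 20, 10))
```

A straight line ignores the bulk of the complex and the size of the reporter fragments, so it can only serve as a rough first filter; a surface-constrained path is the more realistic measure.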

The modification made to the DHFR PCA is a valuable advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside linkers of additional lengths. This would provide both a validation and a point of comparison, and would help detect problems that arose during construction of the fusion proteins, such as incorrect recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the faulty one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close together, which reduces false positives. The DHFR PCA can be performed on solid or in liquid medium, so PPIs are easy to study under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features give it certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that gives access to several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations, in order to limit false positives. The second would be to determine the optimal increment, the one that maximizes new information while taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.
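The idea of an optimal increment can be made concrete with a short sketch that counts how many new protein pairs each additional linker length would bring within reach. It is only an illustration under stated assumptions: the pair distances, the linker lengths, the 3.5 angstroms per residue and the function names are hypothetical, and detection is reduced to a simple distance threshold.

```python
def detectable_pairs(pair_distances, linker_len, partner_len=10, per_residue=3.5):
    """Pairs whose distance (angstroms) is within the assumed reach of the linkers."""
    reach = (linker_len + partner_len) * per_residue
    return {pair for pair, dist in pair_distances.items() if dist <= reach}


def gained_per_increment(pair_distances, linker_lengths):
    """For each linker length, in increasing order, the pairs not seen with shorter linkers."""
    gained, seen = {}, set()
    for length in sorted(linker_lengths):
        current = detectable_pairs(pair_distances, length)
        gained[length] = current - seen
        seen |= current
    return gained


# Hypothetical distances (angstroms) between the tagged termini of three proteins.
distances = {("A", "B"): 40.0, ("A", "C"): 90.0, ("B", "C"): 130.0}

for length, pairs in gained_per_increment(distances, [10, 20, 30]).items():
    print(length, sorted(pairs))
```

An increment that adds no new pairs brings extra strain construction without extra information, which is the trade-off described above.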

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by extending the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


Table S2C. Identity between proteasome structure and the experimental sequence

Columns: Reference, Yeast protein, Complex, Identity (%)

5CZ4-centered chain A Pre8 Proteasome 100

5CZ4-centered chain AA Pre4 Proteasome 100

5CZ4-centered chain B Pre9 Proteasome 100

5CZ4-centered chain BA Pre3 Proteasome 100

5CZ4-centered chain C Pre6 Proteasome 100

5CZ4-centered chain D Pup2 Proteasome 97.1

5CZ4-centered chain E Pre5 Proteasome 100

5CZ4-centered chain F Pre10 Proteasome 100

5CZ4-centered chain G Scl1 Proteasome 100

5CZ4-centered chain H Pup1 Proteasome 100

5CZ4-centered chain I Pup3 Proteasome 100

5CZ4-centered chain J Pre1 Proteasome 100

5CZ4-centered chain K Pre2 Proteasome 100

5CZ4-centered chain L Pre7 Proteasome 100

5CZ4-centered chain M Pre4 Proteasome 100

5CZ4-centered chain N Pre3 Proteasome 100

5CZ4-centered chain O Pre8 Proteasome 100

5CZ4-centered chain P Pre9 Proteasome 100

5CZ4-centered chain Q Pre6 Proteasome 100

5CZ4-centered chain R Pup2 Proteasome 97.1

5CZ4-centered chain S Pre5 Proteasome 100

5CZ4-centered chain T Pre10 Proteasome 100

5CZ4-centered chain U Scl1 Proteasome 100

5CZ4-centered chain V Pup1 Proteasome 100

5CZ4-centered chain W Pup3 Proteasome 100

5CZ4-centered chain X Pre1 Proteasome 100

5CZ4-centered chain Y Pre2 Proteasome 100

5CZ4-centered chain Z Pre7 Proteasome 100

5A5B-centered chain A Pre3 Proteasome 100

5A5B-centered chain AA Rpn7 Proteasome 100

5A5B-centered chain B Pup1 Proteasome 100

5A5B-centered chain BA Rpn3 Proteasome 100

5A5B-centered chain C Pup3 Proteasome 100

5A5B-centered chain CA Rpn12 Proteasome 100

5A5B-centered chain D Pre1 Proteasome 100

5A5B-centered chain DA Rpn8 Proteasome 82.9

5A5B-centered chain E Pre2 Proteasome 99.5

5A5B-centered chain EA Rpn11 Proteasome 89.5

5A5B-centered chain F Pre7 Proteasome 100

5A5B-centered chain FA Rpn10 Proteasome 100

5A5B-centered chain G Pre4 Proteasome 100

5A5B-centered chain GA Rpn13 Proteasome 100

5A5B-centered chain HA Sem1 Proteasome 100

5A5B-centered chain IA Rpn1 Proteasome 85.9


5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 93.1

5A5B-centered chain W Rpn2 Proteasome 90.9

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 93.1

Constructed proteasome chain 22 Rpn2 Proteasome 90.9

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 82.9

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 89.5

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 85.9

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100


Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100
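An identity percentage of the kind reported in Table S2C can be computed from a pairwise alignment as sketched below. This is an assumed convention, not necessarily the one used to build the table: identity is taken here as the share of identical residues among aligned, non-gap positions, and the sequences and function name are hypothetical.

```python
def percent_identity(aligned_ref, aligned_query, gap="-"):
    """Percent of identical residues over aligned positions where neither sequence has a gap."""
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for ref_res, query_res in zip(aligned_ref, aligned_query):
        if ref_res == gap or query_res == gap:
            continue  # gap positions are excluded from the denominator
        compared += 1
        if ref_res == query_res:
            matches += 1
    if compared == 0:
        return 0.0
    return round(100.0 * matches / compared, 1)


# Hypothetical aligned fragments: one substitution in eight positions.
print(percent_identity("MKTAYIAK", "MKTAHIAK"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values when residues are missing from the structure.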


Table S2D. Number of missing residues in the C-termini of studied proteins in the RNApol I, II and III and proteasome structures

Columns: Yeast protein, Complex, Reference, Number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7
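A count like the one tabulated above can be sketched as the difference between the full sequence length and the last residue modelled in the structure. This is only an illustration under stated assumptions: the function name and inputs are hypothetical, residue numbering is assumed to be 1-based and aligned with the full-length sequence, and the source does not state how the counts were actually obtained.

```python
def missing_c_terminal_residues(sequence_length, resolved_residue_numbers):
    """Count residues located after the last residue modelled in a structure.

    sequence_length: length of the full-length protein (hypothetical input).
    resolved_residue_numbers: 1-based residue numbers present in the structure.
    """
    resolved = list(resolved_residue_numbers)
    if not resolved:
        # Nothing is modelled, so the whole chain counts as missing.
        return sequence_length
    return sequence_length - max(resolved)
```

A large count flags a C-terminus whose position in the complex is unknown, so distances measured from the last resolved residue are less reliable for that protein.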


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths, according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right because the 4xL linker can detect interactions that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assays of selected results of the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
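The two quantities used in panels (B) and (C), a z-score cut-off on colony size and a correlation between reciprocal interactions, can be illustrated with a minimal sketch. The function names and the data layout are hypothetical, and the z-score here is computed against the mean and population standard deviation of all colonies supplied, which is an assumption that may differ from the normalization actually used. The threshold is passed in as a parameter rather than fixed.

```python
import math
from statistics import mean, pstdev


def positive_ppis(colony_sizes, z_threshold):
    """Return the protein pairs whose colony-size z-score exceeds z_threshold.

    colony_sizes maps a (bait, prey) pair to its colony size.
    """
    sizes = list(colony_sizes.values())
    mu, sigma = mean(sizes), pstdev(sizes)
    if sigma == 0:
        # All colonies are identical, so nothing stands out.
        return set()
    return {pair for pair, size in colony_sizes.items()
            if (size - mu) / sigma > z_threshold}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def reciprocal_correlation(signal):
    """Correlate the signal of each A-B pair with that of its reciprocal B-A.

    signal maps (bait, prey) to a PPI signal; pairs without a reciprocal
    measurement are ignored.
    """
    xs, ys = [], []
    for (a, b), value in signal.items():
        if a < b and (b, a) in signal:
            xs.append(value)
            ys.append(signal[(b, a)])
    return pearson_r(xs, ys)
```

Computing the reciprocal correlation separately for each linker combination gives one coefficient per combination, which is the form in which the caption reports them.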


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
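A distance-weighted shortest path of the kind shown in panel (C) can be sketched with Dijkstra's algorithm on a graph whose nodes are surface residues and the two C-terminal residues. This is a hedged illustration, not the procedure used for the figure: the function name, the coordinate dictionary and the `cutoff` parameter (the maximum distance at which two residues are considered connected) are assumptions introduced here.

```python
import heapq
import math


def shortest_surface_path(points, start, end, cutoff):
    """Dijkstra shortest path over residues connected when within `cutoff`.

    points maps a residue label to its (x, y, z) coordinates. Edges are
    weighted by Euclidean distance. Returns (path_length, path), or
    (math.inf, []) when the two residues are not connected.
    """
    best = {start: 0.0}
    previous = {}
    visited = set()
    heap = [(0.0, start)]
    while heap:
        dist_u, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)
        if u == end:
            break
        for v, coords in points.items():
            if v in visited:
                continue
            step = math.dist(points[u], coords)
            if step <= cutoff and dist_u + step < best.get(v, math.inf):
                best[v] = dist_u + step
                previous[v] = u
                heapq.heappush(heap, (best[v], v))
    if end not in best:
        return math.inf, []
    path = [end]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return best[end], path[::-1]
```

Restricting the edges to nearby surface residues forces the path to follow the surface of the complex instead of cutting straight through it, which is why the resulting distance can exceed the straight-line distance between the two C-termini.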


General conclusion

The aim of this project was to develop a relatively simple hybrid method. Here, a hybrid method is one that detects associations between proteins that are close to each other in space without necessarily being in direct physical interaction. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, the approach consisted of modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing linker length changed the interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and allowed better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, more than 30% of all unique interactions detected were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths used. For example, it could be lower if only a single combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases.


The rare cases in which the signal decreased as linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nonetheless provide structural information, notably by identifying the strongest associations within the complex. Moreover, the use of extended linkers is informative about the organization of protein complexes, particularly when central proteins are involved. Finally, the associations detected closely reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing linker length does allow the detection of associations between proteins that are further apart in space.

The modification made to the DHFR PCA is a useful advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the linkers of other lengths. This would provide a validation and a point of comparison, and would help detect problems that occurred during strain construction. For example, it becomes easier to spot a faulty recombination event or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are brought very close together, which reduces false positives. The DHFR PCA can be performed on solid or in liquid medium, so PPIs are easy to study under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features provide certain advantages over other hybrid methods.

Only two linker lengths were tested in this project. It would be interesting to establish a range of linker lengths that would give several resolutions of the PPI network. The first step would be to determine the maximal length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also have to be determined, so as to maximize the new information gained while taking into account the complexity added with each additional linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the preservation of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The resolution scale offered by extending the linker would give a general picture of their architecture. At the scale of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


5A5B-centered chain J Scl1 Proteasome 100

5A5B-centered chain K Pre8 Proteasome 100

5A5B-centered chain L Pre9 Proteasome 100

5A5B-centered chain M Pre6 Proteasome 100

5A5B-centered chain N Pup2 Proteasome 100

5A5B-centered chain O Pre5 Proteasome 100

5A5B-centered chain P Pre10 Proteasome 100

5A5B-centered chain Q Rpt1 Proteasome 88

5A5B-centered chain R Rpt2 Proteasome 100

5A5B-centered chain S Rpt6 Proteasome 100

5A5B-centered chain T Rpt3 Proteasome 100

5A5B-centered chain U Rpt4 Proteasome 100

5A5B-centered chain V Rpt5 Proteasome 931

5A5B-centered chain W Rpn2 Proteasome 909

5A5B-centered chain X Rpn9 Proteasome 100

5A5B-centered chain Y Rpn5 Proteasome 100

5A5B-centered chain Z Rpn6 Proteasome 100

Constructed proteasome chain 1 Pup1 Proteasome 100

Constructed proteasome chain 10 Pre8 Proteasome 100

Constructed proteasome chain 11 Pre9 Proteasome 100

Constructed proteasome chain 12 Pre6 Proteasome 100

Constructed proteasome chain 13 Pup2 Proteasome 100

Constructed proteasome chain 14 Pre5 Proteasome 100

Constructed proteasome chain 15 Pre10 Proteasome 100

Constructed proteasome chain 16 Rpt1 Proteasome 88

Constructed proteasome chain 17 Rpt2 Proteasome 100

Constructed proteasome chain 18 Rpt6 Proteasome 100

Constructed proteasome chain 19 Rpt3 Proteasome 100

Constructed proteasome chain 2 Pup3 Proteasome 100

Constructed proteasome chain 20 Rpt4 Proteasome 100

Constructed proteasome chain 21 Rpt5 Proteasome 931

Constructed proteasome chain 22 Rpn2 Proteasome 909

Constructed proteasome chain 23 Rpn9 Proteasome 100

Constructed proteasome chain 24 Rpn5 Proteasome 100

Constructed proteasome chain 25 Rpn6 Proteasome 100

Constructed proteasome chain 26 Rpn7 Proteasome 100

Constructed proteasome chain 27 Rpn3 Proteasome 100

Constructed proteasome chain 28 Rpn12 Proteasome 100

Constructed proteasome chain 29 Rpn8 Proteasome 829

Constructed proteasome chain 3 Pre1 Proteasome 100

Constructed proteasome chain 30 Rpn11 Proteasome 895

Constructed proteasome chain 31 Rpn10 Proteasome 100

Constructed proteasome chain 32 Rpn13 Proteasome 100

Constructed proteasome chain 33 Sem1 Proteasome 100

Constructed proteasome chain 34 Rpn1 Proteasome 859

Constructed proteasome chain 35 Pup1 Proteasome 100

Constructed proteasome chain 36 Pup3 Proteasome 100

Constructed proteasome chain 37 Pre1 Proteasome 100

Constructed proteasome chain 38 Pre2 Proteasome 100

Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 931

Constructed proteasome chain 58 Rpn2 Proteasome 909

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 829

Constructed proteasome chain 66 Rpn11 Proteasome 895

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 859

Constructed proteasome chain 9 Scl1 Proteasome 100

Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I, II and III and proteasome structures

Columns: yeast protein, complex, reference structure, number of missing residues in C-ter

Rpa190 RNApol I 4C2M monomer 1 0

Rpa14 RNApol I 4C2M monomer 1 37

Rpa12 RNApol I 4C2M monomer 1 0

Rpb5 RNApol I 4C2M monomer 1 0

Rpb10 RNApol I 4C2M monomer 1 1

Rpa49 RNApol I 4C2M monomer 1 300

Rpc19 RNApol I 4C2M monomer 1 0

Rpb8 RNApol I 4C2M monomer 1 0

Rpa34 RNApol I 4C2M monomer 1 52

Rpa43 RNApol I 4C2M monomer 1 10

Rpc40 RNApol I 4C2M monomer 1 0

Rpc10 RNApol I 4C2M monomer 1 0

Rpa135 RNApol I 4C2M monomer 1 0

Rpo26 RNApol I 4C2M monomer 1 1

Rpa190 RNApol I 4C2M monomer 2 0

Rpa14 RNApol I 4C2M monomer 2 37

Rpa12 RNApol I 4C2M monomer 2 0

Rpb5 RNApol I 4C2M monomer 2 0

Rpb10 RNApol I 4C2M monomer 2 1

Rpa49 RNApol I 4C2M monomer 2 300

Rpc19 RNApol I 4C2M monomer 2 0

Rpb8 RNApol I 4C2M monomer 2 0

Rpa34 RNApol I 4C2M monomer 2 53

Rpa43 RNApol I 4C2M monomer 2 76

Rpc40 RNApol I 4C2M monomer 2 0

Rpc10 RNApol I 4C2M monomer 2 0

Rpa135 RNApol I 4C2M monomer 2 0

Rpo26 RNApol I 4C2M monomer 2 1

Rpa190 RNApol I 4C3I 1

Rpa14 RNApol I 4C3I 37

Rpb5 RNApol I 4C3I 0

Rpb10 RNApol I 4C3I 1

Rpa49 RNApol I 4C3I 301

Rpc19 RNApol I 4C3I 0

Rpb8 RNApol I 4C3I 0

Rpa34 RNApol I 4C3I 53

Rpa12 RNApol I 4C3I 0

Rpa43 RNApol I 4C3I 10

Rpc40 RNApol I 4C3I 0

Rpc10 RNApol I 4C3I 0

Rpa135 RNApol I 4C3I 0

Rpo26 RNApol I 4C3I 1

Rpb3 RNApol II 4V1N 50

Rpb11 RNApol II 4V1N 6

Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23

Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7

Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (Proteasome: top right; RNApol I, II and III: bottom left; COG complex: bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths, according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL to detect interactions further in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
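The thresholding described in panel (B) can be illustrated with a minimal sketch. This section does not state how the z-scores were computed, so the version below assumes the simplest reading: each colony size is standardized against the mean and standard deviation of a background distribution of colony sizes. The function names, the pair labels and the default threshold of 2.5 are illustrative assumptions, not the author's pipeline.

```python
from statistics import mean, pstdev


def z_scores(sizes, background):
    """Standardize colony sizes against a background distribution.

    sizes: dict mapping a (hypothetical) pair label to its colony size.
    background: list of colony sizes used as the reference distribution
                (assumption: the text does not say which reference was used).
    """
    mu = mean(background)
    sigma = pstdev(background)
    if sigma == 0:
        raise ValueError("background has zero variance; z-scores are undefined")
    return {pair: (size - mu) / sigma for pair, size in sizes.items()}


def positive_ppis(sizes, background, threshold=2.5):
    """Return the pair labels whose z-score is strictly above the threshold."""
    scores = z_scores(sizes, background)
    return sorted(pair for pair, z in scores.items() if z > threshold)


# Hypothetical background and candidate pairs, for illustration only.
background = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10]
candidates = {"P1-P2": 14, "P3-P4": 12}
print(positive_ppis(candidates, background))
```

With this toy background, only the pair with the clearly larger colony passes the cutoff; a real analysis would choose the reference distribution per plate or per experiment.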

Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
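The idea of a distance-weighted shortest path between two C-termini, as in panels (C) and (D), can be sketched as follows. This is only an illustration under stated assumptions: surface residues are reduced to 3D points, two points are connected when they lie within a hypothetical `cutoff` distance, each edge is weighted by its Euclidean length, and Dijkstra's algorithm gives the path length. The graph construction, the cutoff value and the function names are assumptions; the caption does not describe the actual procedure.

```python
import heapq
import math


def build_graph(points, cutoff):
    """Connect every pair of points closer than `cutoff`, weighted by distance."""
    graph = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return math.inf


# Hypothetical coordinates: index 0 and 2 stand in for two C-terminal residues.
points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0), (10.0, 10.0, 10.0)]
graph = build_graph(points, cutoff=4.5)
print(shortest_path_length(graph, 0, 2))
```

Routing the path through surface points rather than taking the straight-line distance keeps the measured distance from cutting through the body of the complex, which is presumably why a path over surface residues is used.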

General conclusion

The aim of this project was to develop a relatively simple hybrid method. Here, a hybrid method means a method that detects associations between proteins that are close together in space, without these necessarily being physical interactions. Such a method would allow the architecture of protein complexes to be examined in greater depth and dissected more finely. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to check whether increasing the linker length changed the interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio, allowing associations to be identified more reliably. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, more than 30% of all the unique interactions detected were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and of those whose signal increases. The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers is informative about the organization of protein complexes, particularly when it involves the central proteins. Finally, the associations detected reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow associations between proteins that are further apart in space to be detected.
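The bookkeeping behind statements such as the share of new interactions, or associations detectable only with one specific linker combination, can be sketched with simple set operations. The example below is a hypothetical illustration: the protein names, the detected pairs and the set of previously known pairs are invented placeholders, and the function name `summarize` is an assumption rather than part of the thesis analysis.

```python
def summarize(detected_by_combo, known):
    """Summarize detections across linker combinations.

    detected_by_combo: dict mapping a linker combination label (e.g. "2xL-2xL")
                       to the set of protein pairs detected with it.
    known: set of protein pairs already reported before the experiment.
    Returns (fraction of unique detected pairs that are new,
             dict of pairs detected with only one combination).
    """
    all_detected = set().union(*detected_by_combo.values())
    novel = all_detected - known
    fraction_new = len(novel) / len(all_detected) if all_detected else 0.0

    only_in = {}
    for combo, pairs in detected_by_combo.items():
        others = set().union(
            *(p for c, p in detected_by_combo.items() if c != combo)
        )
        only_in[combo] = pairs - others
    return fraction_new, only_in


# Placeholder pairs; frozenset makes a pair independent of bait/prey order.
ab = frozenset({"P1", "P2"})
cd = frozenset({"P3", "P4"})
ef = frozenset({"P5", "P6"})

detected = {"2xL-2xL": {ab, cd}, "4xL-4xL": {ab, ef}}
fraction_new, only_in = summarize(detected, known={ab, cd})
print(round(fraction_new, 3), only_in["4xL-4xL"] == {ef})
```

Dropping a combination from `detected` shows directly how the fraction of new pairs shrinks when an association is seen with that combination alone.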

The modification made to the DHFR PCA is a valuable advance for the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside the linkers of additional lengths. This would provide a validation and a point of comparison, and would help detect problems that may have arisen during construction of the fusion proteins. For example, it becomes easier to spot faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale with a robotic platform. Moreover, the DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium, so it is easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of the interactions (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that gives several resolutions of the PPI network. The first step would be to determine the maximum length at which plausible protein-protein associations are still detected while limiting false positives. The optimal increment would also need to be determined, to maximize the new information gained while accounting for the additional complexity that each extra linker introduces. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases, including cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.

Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.

22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.

43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


Constructed proteasome chain 39 Pre7 Proteasome 100

Constructed proteasome chain 4 Pre2 Proteasome 100

Constructed proteasome chain 40 Pre4 Proteasome 100

Constructed proteasome chain 41 Pre3 Proteasome 100

Constructed proteasome chain 42 Pre4 Proteasome 100

Constructed proteasome chain 45 Scl1 Proteasome 100

Constructed proteasome chain 46 Pre8 Proteasome 100

Constructed proteasome chain 47 Pre9 Proteasome 100

Constructed proteasome chain 48 Pre6 Proteasome 100

Constructed proteasome chain 49 Pup2 Proteasome 100

Constructed proteasome chain 5 Pre7 Proteasome 100

Constructed proteasome chain 50 Pre5 Proteasome 100

Constructed proteasome chain 51 Pre10 Proteasome 100

Constructed proteasome chain 52 Rpt1 Proteasome 88

Constructed proteasome chain 53 Rpt2 Proteasome 100

Constructed proteasome chain 54 Rpt6 Proteasome 100

Constructed proteasome chain 55 Rpt3 Proteasome 100

Constructed proteasome chain 56 Rpt4 Proteasome 100

Constructed proteasome chain 57 Rpt5 Proteasome 93.1

Constructed proteasome chain 58 Rpn2 Proteasome 90.9

Constructed proteasome chain 59 Rpn9 Proteasome 100

Constructed proteasome chain 6 Pre3 Proteasome 100

Constructed proteasome chain 60 Rpn5 Proteasome 100

Constructed proteasome chain 61 Rpn6 Proteasome 100

Constructed proteasome chain 62 Rpn7 Proteasome 100

Constructed proteasome chain 63 Rpn3 Proteasome 100

Constructed proteasome chain 64 Rpn12 Proteasome 100

Constructed proteasome chain 65 Rpn8 Proteasome 82.9

Constructed proteasome chain 66 Rpn11 Proteasome 89.5

Constructed proteasome chain 67 Rpn10 Proteasome 100

Constructed proteasome chain 68 Rpn13 Proteasome 100

Constructed proteasome chain 69 Sem1 Proteasome 100

Constructed proteasome chain 70 Rpn1 Proteasome 85.9

Constructed proteasome chain 9 Scl1 Proteasome 100


Table S2D. Number of missing residues in the C-termini of the studied proteins in the RNApol I, II and III and proteasome structures

| Yeast protein | Complex | Reference | Number of missing residues in C-ter |
|---|---|---|---|
| Rpa190 | RNApol I | 4C2M monomer 1 | 0 |
| Rpa14 | RNApol I | 4C2M monomer 1 | 37 |
| Rpa12 | RNApol I | 4C2M monomer 1 | 0 |
| Rpb5 | RNApol I | 4C2M monomer 1 | 0 |
| Rpb10 | RNApol I | 4C2M monomer 1 | 1 |
| Rpa49 | RNApol I | 4C2M monomer 1 | 300 |
| Rpc19 | RNApol I | 4C2M monomer 1 | 0 |
| Rpb8 | RNApol I | 4C2M monomer 1 | 0 |
| Rpa34 | RNApol I | 4C2M monomer 1 | 52 |
| Rpa43 | RNApol I | 4C2M monomer 1 | 10 |
| Rpc40 | RNApol I | 4C2M monomer 1 | 0 |
| Rpc10 | RNApol I | 4C2M monomer 1 | 0 |
| Rpa135 | RNApol I | 4C2M monomer 1 | 0 |
| Rpo26 | RNApol I | 4C2M monomer 1 | 1 |
| Rpa190 | RNApol I | 4C2M monomer 2 | 0 |
| Rpa14 | RNApol I | 4C2M monomer 2 | 37 |
| Rpa12 | RNApol I | 4C2M monomer 2 | 0 |
| Rpb5 | RNApol I | 4C2M monomer 2 | 0 |
| Rpb10 | RNApol I | 4C2M monomer 2 | 1 |
| Rpa49 | RNApol I | 4C2M monomer 2 | 300 |
| Rpc19 | RNApol I | 4C2M monomer 2 | 0 |
| Rpb8 | RNApol I | 4C2M monomer 2 | 0 |
| Rpa34 | RNApol I | 4C2M monomer 2 | 53 |
| Rpa43 | RNApol I | 4C2M monomer 2 | 76 |
| Rpc40 | RNApol I | 4C2M monomer 2 | 0 |
| Rpc10 | RNApol I | 4C2M monomer 2 | 0 |
| Rpa135 | RNApol I | 4C2M monomer 2 | 0 |
| Rpo26 | RNApol I | 4C2M monomer 2 | 1 |
| Rpa190 | RNApol I | 4C3I | 1 |
| Rpa14 | RNApol I | 4C3I | 37 |
| Rpb5 | RNApol I | 4C3I | 0 |
| Rpb10 | RNApol I | 4C3I | 1 |
| Rpa49 | RNApol I | 4C3I | 301 |
| Rpc19 | RNApol I | 4C3I | 0 |
| Rpb8 | RNApol I | 4C3I | 0 |
| Rpa34 | RNApol I | 4C3I | 53 |
| Rpa12 | RNApol I | 4C3I | 0 |
| Rpa43 | RNApol I | 4C3I | 10 |
| Rpc40 | RNApol I | 4C3I | 0 |
| Rpc10 | RNApol I | 4C3I | 0 |
| Rpa135 | RNApol I | 4C3I | 0 |
| Rpo26 | RNApol I | 4C3I | 1 |
| Rpb3 | RNApol II | 4V1N | 50 |
| Rpb11 | RNApol II | 4V1N | 6 |
| Rpb5 | RNApol II | 4V1N | 0 |
| Rpb7 | RNApol II | 4V1N | 0 |
| Rpb10 | RNApol II | 4V1N | 5 |
| Rpo26 | RNApol II | 4V1N | 0 |
| Rpb8 | RNApol II | 4V1N | 0 |
| Rpb4 | RNApol II | 4V1N | 0 |
| Rpb9 | RNApol II | 4V1N | 2 |
| Tfg2 | RNApol II | 4V1N | 173 |
| Rpb2 | RNApol II | 4V1N | 0 |
| Rpc10 | RNApol II | 4V1N | 0 |
| Rpo21 | RNApol II | 4V1N | 278 |
| Rpc11 | RNApol III | 5FJA | 0 |
| Rpc19 | RNApol III | 5FJA | 0 |
| Ret1 | RNApol III | 5FJA | 0 |
| Rpb5 | RNApol III | 5FJA | 0 |
| Rpb10 | RNApol III | 5FJA | 3 |
| Rpc37 | RNApol III | 5FJA | 20 |
| Rpc82 | RNApol III | 5FJA | 0 |
| Rpc31 | RNApol III | 5FJA | 182 |
| Rpb8 | RNApol III | 5FJA | 0 |
| Rpc53 | RNApol III | 5FJA | 0 |
| Rpc25 | RNApol III | 5FJA | 0 |
| Rpc34 | RNApol III | 5FJA | 2 |
| Rpo31 | RNApol III | 5FJA | 0 |
| Rpc40 | RNApol III | 5FJA | 0 |
| Rpc10 | RNApol III | 5FJA | 0 |
| Rpc17 | RNApol III | 5FJA | 0 |
| Rpo26 | RNApol III | 5FJA | 2 |
| Rpn6 | Proteasome | 5CZ4 and 5A5B | 3 |
| Rpn5 | Proteasome | 5CZ4 and 5A5B | 3 |
| Rpn3 | Proteasome | 5CZ4 and 5A5B | 45 |
| Rpn2 | Proteasome | 5CZ4 and 5A5B | 20 |
| Rpn1 | Proteasome | 5CZ4 and 5A5B | 0 |
| Rpn9 | Proteasome | 5CZ4 and 5A5B | 6 |
| Rpn8 | Proteasome | 5CZ4 and 5A5B | 30 |
| Pre10 | Proteasome | 5CZ4 and 5A5B | 39 |
| Pre6 | Proteasome | 5CZ4 and 5A5B | 10 |
| Pre7 | Proteasome | 5CZ4 and 5A5B | 0 |
| Rpt3 | Proteasome | 5CZ4 and 5A5B | 0 |
| Rpt2 | Proteasome | 5CZ4 and 5A5B | 1 |
| Pre2 | Proteasome | 5CZ4 and 5A5B | 0 |
| Rpt4 | Proteasome | 5CZ4 and 5A5B | 10 |
| Pre1 | Proteasome | 5CZ4 and 5A5B | 3 |
| Pre8 | Proteasome | 5CZ4 and 5A5B | 0 |
| Pre9 | Proteasome | 5CZ4 and 5A5B | 12 |
| Pup2 | Proteasome | 5CZ4 and 5A5B | 9 |
| Pup3 | Proteasome | 5CZ4 and 5A5B | 0 |
| Pup1 | Proteasome | 5CZ4 and 5A5B | 6 |
| Rpn13 | Proteasome | 5CZ4 and 5A5B | 23 |
| Rpn12 | Proteasome | 5CZ4 and 5A5B | 2 |
| Rpn11 | Proteasome | 5CZ4 and 5A5B | 8 |
| Rpn10 | Proteasome | 5CZ4 and 5A5B | 71 |
| Sem1 | Proteasome | 5CZ4 and 5A5B | 0 |
| Scl1 | Proteasome | 5CZ4 and 5A5B | 0 |
| Rpt1 | Proteasome | 5CZ4 and 5A5B | 11 |
| Pre4 | Proteasome | 5CZ4 and 5A5B | 4 |
| Pre5 | Proteasome | 5CZ4 and 5A5B | 0 |
| Rpt5 | Proteasome | 5CZ4 and 5A5B | 0 |
| Pre3 | Proteasome | 5CZ4 and 5A5B | 0 |
| Rpt6 | Proteasome | 5CZ4 and 5A5B | 9 |
| Rpn7 | Proteasome | 5CZ4 and 5A5B | 7 |
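A count such as those in Table S2D could be derived as in the following minimal sketch. It assumes that the number of missing C-terminal residues is the difference between the full sequence length and the last residue position resolved in the structure. The function name, its inputs and the toy values are hypothetical and are not taken from the thesis, which does not specify how the counts were obtained.

```python
def missing_c_terminal_residues(sequence_length, resolved_positions):
    """Count residues absent from a structure after the last resolved one.

    sequence_length: length of the full-length protein (hypothetical input).
    resolved_positions: 1-based sequence positions that have coordinates.
    """
    positions = list(resolved_positions)
    if not positions:
        # Nothing is resolved, so the whole chain counts as missing.
        return sequence_length
    return sequence_length - max(positions)


# Toy example (made-up numbers): a 120-residue protein resolved from
# position 5 to position 83 lacks its last 37 residues.
toy_missing = missing_c_terminal_residues(120, range(5, 84))
```

A large count indicates that the C-terminus carrying the DHFR fragment is not visible in the structure, so any distance measured from the last resolved residue is only an approximation for that protein.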


Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signals (colony size) obtained in the global PCA (top left) and in the intra-complexes experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths, according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right because the 4xL linker can detect interactions between proteins that are farther apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complexes experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
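The thresholding in panel (B) and the reciprocal-interaction correlation in panel (C) can be illustrated with the short sketch below. It is only an illustration under stated assumptions: z-scores are computed here against the mean and population standard deviation of the same set of colony sizes, which may differ from the reference distribution used in the thesis, and the default threshold of 2.5 is a parameter. All function names and the toy colony sizes are hypothetical.

```python
import math
import statistics


def z_scores(values):
    """Standardize values against their own mean and standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - mean) / sd for v in values]


def call_positives(colony_sizes, z_threshold=2.5):
    """Return the names of pairs whose colony-size z-score exceeds the threshold."""
    names = list(colony_sizes)
    scores = z_scores([colony_sizes[n] for n in names])
    return [n for n, z in zip(names, scores) if z > z_threshold]


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of signals."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical colony sizes (arbitrary units) for ten bait-prey pairs:
# nine background-sized colonies and one clearly larger colony.
toy_sizes = {f"pair{i}": 100 for i in range(9)}
toy_sizes["pair9"] = 1000
toy_positives = call_positives(toy_sizes)
```

For reciprocal interactions, `pearson_r` would be applied to the signals of each pair measured in both bait-prey orientations, which is the kind of comparison summarized by the r values in panel (C).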


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
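The distance-weighted shortest path of panels (C) and (D) can be sketched with Dijkstra's algorithm on a graph whose nodes are surface residues. The sketch below is an assumption about how such a calculation could be set up, not the procedure used in the thesis: edges join any two residues closer than a hypothetical cutoff `max_step`, each edge is weighted by the Euclidean distance between the two residues, and the coordinates in the toy example are invented.

```python
import heapq
import math


def shortest_surface_path(coords, start, end, max_step=8.0):
    """Dijkstra over residues; edges join residues closer than max_step.

    coords: dict mapping a residue label to its (x, y, z) position.
    Returns (total_length, path), or (math.inf, []) if end is unreachable.
    """
    best = {start: 0.0}
    previous = {}
    queue = [(0.0, start)]
    visited = set()
    while queue:
        dist, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == end:
            break
        for other, xyz in coords.items():
            if other in visited:
                continue
            step = math.dist(coords[node], xyz)
            if step > max_step:
                # Too far apart to be joined by a single edge.
                continue
            candidate = dist + step
            if candidate < best.get(other, math.inf):
                best[other] = candidate
                previous[other] = node
                heapq.heappush(queue, (candidate, other))
    if end not in best:
        return math.inf, []
    # Walk back from the end to the start to recover the path.
    path = [end]
    while path[-1] != start:
        path.append(previous[path[-1]])
    return best[end], path[::-1]


# Toy example with invented coordinates: A and C are too far apart for a
# direct edge (about 8.49 > 8.0), so the path has to pass through B.
toy_coords = {"A": (0.0, 0.0, 0.0), "B": (6.0, 0.0, 0.0), "C": (6.0, 6.0, 0.0)}
toy_length, toy_path = shortest_surface_path(toy_coords, "A", "C")
```

Restricting the nodes to surface residues makes the path follow the outside of the complex instead of cutting through it, which is a closer analogue of the route a flexible linker would have to take than a straight-line distance.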


General conclusion

The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid method is one that detects associations between proteins that are close to each other in space without necessarily being in physical contact. Such a method would allow a deeper and finer dissection of the architecture of protein complexes. In practice, the approach consisted of modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to test whether increasing the linker length changed the set of interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results with distances measured from the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thereby improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variant of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, more than 30% of all the unique interactions detected were new. One could therefore expect to obtain nearly as many new interactions if this variant of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linker lengths used. For example, it could be lower if only a single combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases. The rare cases in which the signal decreased as the linker was lengthened are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. The use of extended linkers also informs on the organization of protein complexes, particularly when central proteins are involved. Finally, the detected associations reflect well the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA represents a valuable advance in the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside the linkers of additional lengths. This would provide both a validation and a point of comparison, and would help detect problems that arose during the construction of the fusion proteins. For example, it becomes easier to spot incorrect recombination or the appearance of mutations: the correctly constructed protein would show interactions, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variant of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method in which proteins are expressed from their endogenous promoter. The fragments do not tend to associate spontaneously unless they are very close to each other, which reduces false positives. The DHFR PCA can be performed on solid or in liquid medium, so PPIs are easy to study under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that gives access to several resolutions of the PPI network. The first step would be to determine the maximal length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also have to be determined, so as to maximize the new information gained while taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2]-tagged proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, could be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M Cusick ME Barabasi AL Interactome networks and human disease Cell 2011144(6)986-98
2. Taylor SS Ilouz R Zhang P Kornev AP Assembly of allosteric macromolecular switches lessons from PKA Nature reviews Molecular cell biology 201213(10)646-58
3. Vandamme J Castermans D Thevelein JM Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation Cellular signalling 201224(8)1610-8
4. Conrad M Schothorst J Kankipati HN Van Zeebroeck G Rubio-Texeira M Thevelein JM Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae FEMS microbiology reviews 201438(2)254-99
5. Broach JR RAS genes in Saccharomyces cerevisiae signal transduction in search of a pathway Trends in genetics TIG 19917(1)28-33
6. Fontana L Partridge L Longo VD Extending healthy life span--from yeast to humans Science 2010328(5976)321-6
7. Wong W Scott JD AKAP signalling complexes focal points in space and time Nature reviews Molecular cell biology 20045(12)959-70
8. Beuschlein F Fassnacht M Assie G Calebiro D Stratakis CA Osswald A et al Constitutive activation of PKA catalytic subunit in adrenal Cushings syndrome N Engl J Med 2014370(11)1019-28
9. Bult CJ Drabkin HJ Evsikov A Natale D Arighi C Roberts N et al The representation of protein complexes in the Protein Ontology (PRO) BMC Bioinformatics 201112371
10. Peters JM Cejka Z Harris JR Kleinschmidt JA Baumeister W Structural features of the 26 S proteasome complex J Mol Biol 1993234(4)932-7
11. Voges D Zwickl P Baumeister W The 26S proteasome a molecular machine designed for controlled proteolysis Annual review of biochemistry 1999681015-68
12. Tanaka K The proteasome overview of structure and functions Proceedings of the Japan Academy Series B Physical and biological sciences 200985(1)12-36
13. Wehmer M Sakata E Recent advances in the structural biology of the 26S proteasome Int J Biochem Cell Biol 201679437-42
14. Gomes AV Genetics of proteasome diseases Scientifica 20132013637629
15. Miller Z Ao L Kim KB Lee W Inhibitors of the immunoproteasome current status and future directions Current pharmaceutical design 201319(22)4140-51
16. Kaur G Batra S Emerging role of immunoproteasomes in pathophysiology Immunology and cell biology 201694(9)812-20
17. Rual J-F Venkatesan K Hao T Hirozane-Kishikawa T Dricot A Li N et al Towards a proteome-scale map of the human protein-protein interaction network Nature 2005437(7062)1173-8
18. Krogan NJ Cagney G Yu H Zhong G Guo X Ignatchenko A et al Global landscape of protein complexes in the yeast Saccharomyces cerevisiae Nature 2006440(7084)637-43
19. Collins SR Kemmeren P Zhao XC Greenblatt JF Spencer F Holstege FC et al Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae Molecular & cellular proteomics MCP 20076(3)439-50
20. Gavin AC Aloy P Grandi P Krause R Boesche M Marzioch M et al Proteome survey reveals modularity of the yeast cell machinery Nature 2006440(7084)631-6
21. Giot L Bader JS Brouwer C Chaudhuri A Kuang B Li Y et al A protein interaction map of Drosophila melanogaster Science 2003302(5651)1727-36


22. Li S Armstrong CM Bertin N Ge H Milstein S Boxem M et al A map of the interactome network of the metazoan C elegans Science 2004303(5657)540-3
23. Rajagopala SV Sikorski P Kumar A Mosca R Vlasblom J Arnold R et al The binary protein-protein interaction landscape of Escherichia coli Nat Biotech 201432(3)285-90
24. Parrish JR Yu J Liu G Hines JA Chan JE Mangiola BA et al A proteome-wide protein interaction map for Campylobacter jejuni Genome Biology 20078(7)1-19
25. Wang Y Cui T Zhang C Yang M Huang Y Li W et al Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv Journal of proteome research 20109(12)6665-77
26. Cherkasov A Hsing M Zoraghi R Foster LJ See RH Stoynov N et al Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus Journal of proteome research 201110(3)1139-50
27. Hagen N Bayer K Rosch K Schindler M The intraviral protein interaction network of hepatitis C virus Molecular & cellular proteomics MCP 201413(7)1676-89
28. Fossum E Friedel CC Rajagopala SV Titz B Baiker A Schmidt T et al Evolutionarily conserved herpesviral protein interaction networks PLoS pathogens 20095(9)e1000570
29. Stellberger T Hauser R Baiker A Pothineni VR Haas J Uetz P Improving the yeast two-hybrid system with permutated fusions proteins the Varicella Zoster Virus interactome Proteome science 201088
30. Obado SO Brillantes M Uryu K Zhang W Ketaren NE Chait BT et al Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex PLoS biology 201614(2)e1002365
31. Diss G Dube AK Boutin J Gagnon-Arsenault I Landry CR A systematic approach for the genetic dissection of protein complexes in living cells Cell Rep 20133(6)2155-67
32. Ferreira LG Oliva G Andricopulo AD Protein-protein interaction inhibitors advances in anticancer drug design Expert opinion on drug discovery 2016
33. Hamdi A Colas P Yeast two-hybrid methods and their applications in drug discovery Trends in pharmacological sciences 201233(2)109-18
34. Zoraghi R Reiner NE Protein interaction networks as starting points to identify novel antimicrobial drug targets Current opinion in microbiology 201316(5)566-72
35. Khare S Nagle AS Biggart A Lai YH Liang F Davis LC et al Proteasome inhibition for treatment of leishmaniasis Chagas disease and sleeping sickness Nature 2016
36. Sahni N Yi S Taipale M Fuxman Bass JI Coulombe-Huntington J Yang F et al Widespread macromolecular interaction perturbations in human genetic disorders Cell 2015161(3)647-60
37. Jensen LJ Bork P Biochemistry Not comparable but complementary Science 2008322(5898)56-7
38. Syafrizayanti Betzen C Hoheisel JD Kastelic D Methods for analyzing and quantifying protein-protein interaction Expert review of proteomics 201411(1)107-20
39. Marcilla M Albar JP Quantitative proteomics A strategic ally to map protein interaction networks IUBMB life 201365(1)9-16
40. Woods AG Sokolowska I Ngounou Wetie AG Wormwood K Aslebagh R Patel S et al Mass spectrometry for proteomics-based investigation Advances in experimental medicine and biology 20148061-32
41. Chen GI Gingras AC Affinity-purification mass spectrometry (AP-MS) of serinethreonine phosphatases Methods 200742(3)298-305
42. Dunham WH Mullin M Gingras AC Affinity-purification coupled to mass spectrometry basic principles and strategies Proteomics 201212(10)1576-90


43. Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69
44. Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6
45. Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32
46. Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54
47. Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36)
48. Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82
49. Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88
50. Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848
51. Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6
52. Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4
53. Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63
54. Michnick SW Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments Current opinion in structural biology 200111(4)472-7
55. Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70
56. Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43
57. Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97)
58. Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69
59. Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64
60. Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8
61. Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26
62. Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46
63. Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14


64. Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13
65. Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32
66. Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40
67. Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54
68. Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10
69. Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3
70. Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704
71. Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25
72. Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23
73. Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7
74. Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015
75. Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and phenotype Curr Opin Genet Dev 201323(6)649-57
76. Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17
77. Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8
78. Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747
79. Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36
80. Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794
81. Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9
82. Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44
83. Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9


84. Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28
85. Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565
86. Baker M Proteomics The interaction map Nature 2012484(7393)271-5
87. Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82
88. Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82
89. Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68
90. Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6
91. Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81
92. Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21
93. Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4
94. Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86
95. Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 2017 update Nucleic Acids Res 201745(D1)D369-D79
97. Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288
98. Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8
99. Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24
100. Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37
101. Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8
102. Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70
103. Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host & microbe 20083(4)206-12
104. Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709

Page 50: Mesurer les associations protéiques à proximité in …...Mesurer les associations protéiques à proximité in vivo en utilisant la complémentation de fragments protéiques Mémoire

37

Table S2D. Number of missing residues in the C-termini of studied proteins in RNApol I, II and III and proteasome structures

Yeast protein  Complex     Reference        Missing C-ter residues
Rpa190         RNApol I    4C2M monomer 1   0
Rpa14          RNApol I    4C2M monomer 1   37
Rpa12          RNApol I    4C2M monomer 1   0
Rpb5           RNApol I    4C2M monomer 1   0
Rpb10          RNApol I    4C2M monomer 1   1
Rpa49          RNApol I    4C2M monomer 1   300
Rpc19          RNApol I    4C2M monomer 1   0
Rpb8           RNApol I    4C2M monomer 1   0
Rpa34          RNApol I    4C2M monomer 1   52
Rpa43          RNApol I    4C2M monomer 1   10
Rpc40          RNApol I    4C2M monomer 1   0
Rpc10          RNApol I    4C2M monomer 1   0
Rpa135         RNApol I    4C2M monomer 1   0
Rpo26          RNApol I    4C2M monomer 1   1
Rpa190         RNApol I    4C2M monomer 2   0
Rpa14          RNApol I    4C2M monomer 2   37
Rpa12          RNApol I    4C2M monomer 2   0
Rpb5           RNApol I    4C2M monomer 2   0
Rpb10          RNApol I    4C2M monomer 2   1
Rpa49          RNApol I    4C2M monomer 2   300
Rpc19          RNApol I    4C2M monomer 2   0
Rpb8           RNApol I    4C2M monomer 2   0
Rpa34          RNApol I    4C2M monomer 2   53
Rpa43          RNApol I    4C2M monomer 2   76
Rpc40          RNApol I    4C2M monomer 2   0
Rpc10          RNApol I    4C2M monomer 2   0
Rpa135         RNApol I    4C2M monomer 2   0
Rpo26          RNApol I    4C2M monomer 2   1
Rpa190         RNApol I    4C3I             1
Rpa14          RNApol I    4C3I             37
Rpb5           RNApol I    4C3I             0
Rpb10          RNApol I    4C3I             1
Rpa49          RNApol I    4C3I             301
Rpc19          RNApol I    4C3I             0
Rpb8           RNApol I    4C3I             0
Rpa34          RNApol I    4C3I             53
Rpa12          RNApol I    4C3I             0
Rpa43          RNApol I    4C3I             10
Rpc40          RNApol I    4C3I             0
Rpc10          RNApol I    4C3I             0
Rpa135         RNApol I    4C3I             0
Rpo26          RNApol I    4C3I             1
Rpb3           RNApol II   4V1N             50
Rpb11          RNApol II   4V1N             6
Rpb5           RNApol II   4V1N             0
Rpb7           RNApol II   4V1N             0
Rpb10          RNApol II   4V1N             5
Rpo26          RNApol II   4V1N             0
Rpb8           RNApol II   4V1N             0
Rpb4           RNApol II   4V1N             0
Rpb9           RNApol II   4V1N             2
Tfg2           RNApol II   4V1N             173
Rpb2           RNApol II   4V1N             0
Rpc10          RNApol II   4V1N             0
Rpo21          RNApol II   4V1N             278
Rpc11          RNApol III  5FJA             0
Rpc19          RNApol III  5FJA             0
Ret1           RNApol III  5FJA             0
Rpb5           RNApol III  5FJA             0
Rpb10          RNApol III  5FJA             3
Rpc37          RNApol III  5FJA             20
Rpc82          RNApol III  5FJA             0
Rpc31          RNApol III  5FJA             182
Rpb8           RNApol III  5FJA             0
Rpc53          RNApol III  5FJA             0
Rpc25          RNApol III  5FJA             0
Rpc34          RNApol III  5FJA             2
Rpo31          RNApol III  5FJA             0
Rpc40          RNApol III  5FJA             0
Rpc10          RNApol III  5FJA             0
Rpc17          RNApol III  5FJA             0
Rpo26          RNApol III  5FJA             2
Rpn6           Proteasome  5CZ4 and 5A5B    3
Rpn5           Proteasome  5CZ4 and 5A5B    3
Rpn3           Proteasome  5CZ4 and 5A5B    45
Rpn2           Proteasome  5CZ4 and 5A5B    20
Rpn1           Proteasome  5CZ4 and 5A5B    0
Rpn9           Proteasome  5CZ4 and 5A5B    6
Rpn8           Proteasome  5CZ4 and 5A5B    30
Pre10          Proteasome  5CZ4 and 5A5B    39
Pre6           Proteasome  5CZ4 and 5A5B    10
Pre7           Proteasome  5CZ4 and 5A5B    0
Rpt3           Proteasome  5CZ4 and 5A5B    0
Rpt2           Proteasome  5CZ4 and 5A5B    1
Pre2           Proteasome  5CZ4 and 5A5B    0
Rpt4           Proteasome  5CZ4 and 5A5B    10
Pre1           Proteasome  5CZ4 and 5A5B    3
Pre8           Proteasome  5CZ4 and 5A5B    0
Pre9           Proteasome  5CZ4 and 5A5B    12
Pup2           Proteasome  5CZ4 and 5A5B    9
Pup3           Proteasome  5CZ4 and 5A5B    0
Pup1           Proteasome  5CZ4 and 5A5B    6
Rpn13          Proteasome  5CZ4 and 5A5B    23
Rpn12          Proteasome  5CZ4 and 5A5B    2
Rpn11          Proteasome  5CZ4 and 5A5B    8
Rpn10          Proteasome  5CZ4 and 5A5B    71
Sem1           Proteasome  5CZ4 and 5A5B    0
Scl1           Proteasome  5CZ4 and 5A5B    0
Rpt1           Proteasome  5CZ4 and 5A5B    11
Pre4           Proteasome  5CZ4 and 5A5B    4
Pre5           Proteasome  5CZ4 and 5A5B    0
Rpt5           Proteasome  5CZ4 and 5A5B    0
Pre3           Proteasome  5CZ4 and 5A5B    0
Rpt6           Proteasome  5CZ4 and 5A5B    9
Rpn7           Proteasome  5CZ4 and 5A5B    7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect interactions between proteins that are further apart in space. (E) Repetition of the standard DHFR PCA for selected results from the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
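The thresholding in panel (B) and the reciprocal-interaction correlation in panel (C) can be illustrated with a minimal Python sketch. The caption does not specify how the z-scores were computed, so the sketch assumes that colony sizes are standardized against a background distribution of colony sizes; the function names and the choice of background are hypothetical, and only the 2.5 threshold comes from the caption.

```python
from math import sqrt
from statistics import mean, stdev


def z_scores(colony_sizes, background_sizes):
    """Standardize colony sizes against a background distribution.

    Assumption: the background is a set of colony sizes taken to represent
    non-interacting pairs; the source does not state which set was used.
    """
    mu = mean(background_sizes)
    sigma = stdev(background_sizes)
    return [(size - mu) / sigma for size in colony_sizes]


def positive_ppis(colony_sizes, background_sizes, threshold=2.5):
    """Flag a PPI as positive when its z-score is above the threshold."""
    return [z > threshold for z in z_scores(colony_sizes, background_sizes)]


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of signals,
    e.g. bait-prey versus prey-bait colony sizes for reciprocal pairs."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)
```

Calling `positive_ppis` with the measured colony sizes and a background sample returns one boolean per tested pair, and `pearson_r` applied to the two orientations of each reciprocal pair gives a coefficient comparable to the r values quoted in panel (C).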


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres/dots: surface. Green spheres: surface residues on the proteasome.
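The distance-weighted shortest path of panels (C) and (D) can be sketched as a graph search over surface residues. The caption does not describe the algorithm or its parameters, so this sketch makes several assumptions: each residue is reduced to one 3D coordinate, two residues are connected when they lie within a cutoff distance, edges are weighted by Euclidean distance, and Dijkstra's algorithm is used. The function name, the residue labels and the cutoff value are all hypothetical.

```python
import heapq
from math import dist


def shortest_surface_path(coords, start, end, cutoff=10.0):
    """Length of the distance-weighted shortest path between two residues.

    coords maps a residue label to an (x, y, z) tuple. Residues closer than
    `cutoff` are treated as neighbors (an assumed rule, not from the source).
    Returns the summed edge lengths, or None if `end` cannot be reached.
    """
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        travelled, node = heapq.heappop(heap)
        if node == end:
            return travelled
        if travelled > best.get(node, float("inf")):
            continue  # stale heap entry, a shorter route was already found
        for other, xyz in coords.items():
            if other == node:
                continue
            step = dist(coords[node], xyz)
            candidate = travelled + step
            if step <= cutoff and candidate < best.get(other, float("inf")):
                best[other] = candidate
                heapq.heappush(heap, (candidate, other))
    return None
```

Restricting the graph to surface residues forces the path to follow the outside of the complex rather than cutting through it, which is presumably why a path length is reported in addition to the straight-line distance between C-termini.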


General conclusion

The goal of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close to each other in space without these associations necessarily being physical interactions. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validation of the method could be completed by comparing the results obtained with the distances measured from the available protein structures of the proteasome.

The results of the first validation show that by adjusting a single parameter, namely doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect the majority of the new PPIs and of those whose signal increases. The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when it involves central proteins. Finally, the associations detected reflect well the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results obtained, it is possible to confirm that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA represents a notable advance in the study of protein-protein associations. Simply by doubling the length of the linker of the DHFR F[1,2] fragment, it is possible to increase the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths, which would provide a validation and a point of comparison and would help detect problems that may have arisen during construction of the fusion proteins. For example, it is easier to spot a problem of incorrect recombination or the appearance of mutations. Indeed, interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. However, adding this control certainly makes the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It does not require any particular infrastructure, but it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the study of interaction dynamics (56). These features provide certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths providing several resolutions of the PPI network. It would first be necessary to determine the maximum length that allows the detection of plausible protein-protein associations while limiting false positives. It would also be necessary to determine the optimal increment to maximize new information, taking into account the additional complexity of each added linker. The availability of robotic platforms makes the creation of collections of DHFR F[1,2] fusion proteins with different linker lengths more realistic. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the more the linker length is increased, the more distant the detected associations are, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and therefore shorter linkers, could be sufficient, whereas the resolution should be coarser for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not perfectly known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and represents a challenge in determining their architecture. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by extending the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


Rpb5 RNApol II 4V1N 0

Rpb7 RNApol II 4V1N 0

Rpb10 RNApol II 4V1N 5

Rpo26 RNApol II 4V1N 0

Rpb8 RNApol II 4V1N 0

Rpb4 RNApol II 4V1N 0

Rpb9 RNApol II 4V1N 2

Tfg2 RNApol II 4V1N 173

Rpb2 RNApol II 4V1N 0

Rpc10 RNApol II 4V1N 0

Rpo21 RNApol II 4V1N 278

Rpc11 RNApol III 5FJA 0

Rpc19 RNApol III 5FJA 0

Ret1 RNApol III 5FJA 0

Rpb5 RNApol III 5FJA 0

Rpb10 RNApol III 5FJA 3

Rpc37 RNApol III 5FJA 20

Rpc82 RNApol III 5FJA 0

Rpc31 RNApol III 5FJA 182

Rpb8 RNApol III 5FJA 0

Rpc53 RNApol III 5FJA 0

Rpc25 RNApol III 5FJA 0

Rpc34 RNApol III 5FJA 2

Rpo31 RNApol III 5FJA 0

Rpc40 RNApol III 5FJA 0

Rpc10 RNApol III 5FJA 0

Rpc17 RNApol III 5FJA 0

Rpo26 RNApol III 5FJA 2

Rpn6 Proteasome 5CZ4 and 5A5B 3

Rpn5 Proteasome 5CZ4 and 5A5B 3

Rpn3 Proteasome 5CZ4 and 5A5B 45

Rpn2 Proteasome 5CZ4 and 5A5B 20

Rpn1 Proteasome 5CZ4 and 5A5B 0

Rpn9 Proteasome 5CZ4 and 5A5B 6

Rpn8 Proteasome 5CZ4 and 5A5B 30

Pre10 Proteasome 5CZ4 and 5A5B 39

Pre6 Proteasome 5CZ4 and 5A5B 10

Pre7 Proteasome 5CZ4 and 5A5B 0

Rpt3 Proteasome 5CZ4 and 5A5B 0

Rpt2 Proteasome 5CZ4 and 5A5B 1

Pre2 Proteasome 5CZ4 and 5A5B 0

Rpt4 Proteasome 5CZ4 and 5A5B 10

Pre1 Proteasome 5CZ4 and 5A5B 3

Pre8 Proteasome 5CZ4 and 5A5B 0

Pre9 Proteasome 5CZ4 and 5A5B 12

Pup2 Proteasome 5CZ4 and 5A5B 9

Pup3 Proteasome 5CZ4 and 5A5B 0

Pup1 Proteasome 5CZ4 and 5A5B 6

Rpn13 Proteasome 5CZ4 and 5A5B 23


Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7


Figure S1. Data related to the PCA experiments.

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r=0.92 for 2xL-2xL, r=0.53 for 2xL-4xL and r=0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when the interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right due to the ability of the 4xL linker to detect interactions farther apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of changes are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
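The thresholding rule described for panel (B) can be sketched as follows. This is only a minimal illustration, assuming that z-scores are computed against the mean and standard deviation of a background set of colony sizes; the caption does not state which reference distribution was used, so `background`, `z_scores` and `call_positive` are hypothetical names, and the cutoff is passed as a parameter.

```python
from statistics import mean, stdev


def z_scores(colony_sizes, background):
    """Standardize colony sizes against a background distribution.

    Assumption: the background is a list of colony sizes taken to
    represent non-interacting pairs (not specified in the caption).
    """
    mu = mean(background)
    sigma = stdev(background)  # sample standard deviation
    return [(size - mu) / sigma for size in colony_sizes]


def call_positive(colony_sizes, background, threshold=2.5):
    """Flag a PPI as positive when its z-score exceeds the cutoff.

    The default cutoff is an assumed value and should be adjusted
    to match the analysis being reproduced.
    """
    return [z > threshold for z in z_scores(colony_sizes, background)]


# Toy numbers, purely illustrative.
background_sizes = [10, 12, 11, 9, 13]
tested_sizes = [11, 20]
calls = call_positive(tested_sizes, background_sizes)
```

With these toy values, a colony whose size equals the background mean is not called, while a colony far above the background is.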


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dot surface. Green spheres: surface residues on the proteasome.
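The distance-weighted shortest path in panels (C) and (D) can be illustrated with a small sketch. It assumes that surface residues are treated as graph nodes, that two residues are connected when they lie within a cutoff distance, and that each edge is weighted by the Euclidean distance between them; the path is then found with Dijkstra's algorithm. The cutoff value, the coordinates and the function names below are hypothetical and are not taken from the thesis.

```python
import heapq
import math


def build_graph(coords, cutoff):
    """Connect every pair of points closer than `cutoff` (assumed value).

    Each edge is weighted by the Euclidean distance between the points.
    """
    graph = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= cutoff:
                graph[i].append((j, d))
                graph[j].append((i, d))
    return graph


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns math.inf if target is unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbor, weight in graph[node]:
            candidate = dist + weight
            if candidate < best.get(neighbor, math.inf):
                best[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return math.inf


# Toy coordinates standing in for surface residues.
# Index 0 and index 2 play the role of the two C-terminal residues.
toy_coords = [(0, 0, 0), (3, 0, 0), (3, 4, 0), (10, 10, 10)]
toy_graph = build_graph(toy_coords, cutoff=4.5)
path_length = shortest_path_length(toy_graph, 0, 2)
```

In this toy case the direct distance between points 0 and 2 exceeds the cutoff, so the path has to pass through point 1, which mimics a path that follows the surface of the complex rather than cutting through it.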


General conclusion

The aim of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close to each other in space without necessarily being in physical interaction. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers of the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing the linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, the validity of the method could be further confirmed by comparing the results obtained with distances measured from the available protein structures of the proteasome.

The results of the first validation show that by changing a single parameter, namely doubling the length of one linker, the signal-to-noise ratio increased significantly, allowing better identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. We could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary depending on the number of combinations of linkers of different lengths used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker for the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and those whose signal increases.

The rare cases where the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nonetheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when it involves central proteins. Finally, the associations detected reflect well the organization of protein complexes into subcomplexes. By comparing the distances between proteins in the proteasome structures with the PCA results obtained, it is possible to confirm that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.

The modification made to the DHFR PCA represents a notable advance in the study of protein-protein associations. Simply by doubling the length of the linker of the DHFR F[1,2] fragment, it is possible to increase the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to linkers of other lengths, which would provide validation and a point of comparison and would help detect problems that may have occurred during construction of the fusion proteins. For example, it is easier to spot incorrect recombination or the appearance of mutations. Interactions would be observed for the correctly constructed protein, whereas the faulty one would show none. However, adding this control certainly makes the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA gives access to an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously with each other unless they are very close together, which reduces false positives. The DHFR PCA can be carried out on either solid or liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths providing several resolutions of the PPI network. It would first be necessary to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. It would also be necessary to determine the optimal increment to maximize new information while taking into account the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas the resolution should be lower for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. For the proteome of an organism, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.


22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.

Page 52: Mesurer les associations protéiques à proximité in …...Mesurer les associations protéiques à proximité in vivo en utilisant la complémentation de fragments protéiques Mémoire

39

Rpn12 Proteasome 5CZ4 and 5A5B 2

Rpn11 Proteasome 5CZ4 and 5A5B 8

Rpn10 Proteasome 5CZ4 and 5A5B 71

Sem1 Proteasome 5CZ4 and 5A5B 0

Scl1 Proteasome 5CZ4 and 5A5B 0

Rpt1 Proteasome 5CZ4 and 5A5B 11

Pre4 Proteasome 5CZ4 and 5A5B 4

Pre5 Proteasome 5CZ4 and 5A5B 0

Rpt5 Proteasome 5CZ4 and 5A5B 0

Pre3 Proteasome 5CZ4 and 5A5B 0

Rpt6 Proteasome 5CZ4 and 5A5B 9

Rpn7 Proteasome 5CZ4 and 5A5B 7

40

41

Figure S1. Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein stability. Act1 protein was used as a loading control. (B) Distribution of PPI signal (colony size) obtained in the global PCA (top left) and in the intra-complex experiments (proteasome, top right; RNApol I, II and III, bottom left; COG complex, bottom right). PPIs with a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have a z-score above 2.5. (C) Example of the correlation observed for PPI signals from reciprocal interactions with the 4xL-4xL combination. Correlation coefficients for the other combinations are r = 0.92 for 2xL-2xL, r = 0.53 for 2xL-4xL and r = 0.74 for 4xL-2xL. (D) Density of PPI z-scores for the proteasome for all combinations of linker lengths according to the distance between the interacting proteins. The red line represents the density of distances for all interactions. The distribution for detected interactions is shifted to the left because proteins are closer to each other when interactions are detected. The 4xL-4xL distribution is also slightly shifted to the right because the 4xL linker can detect interactions between proteins that are farther apart in space. (E) Repetition of the standard DHFR PCA for selected results of the global PCA experiment, showing strong reproducibility. (F) Confirmation by DHFR PCA in a spot-dilution assay of selected results from the intra-complex experiment. Examples for each category of change are shown. Cell growth in the spot-dilution assay (right) correlates with colony size in the standard PCA (left).
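The thresholding described in panel (B) can be illustrated with a short sketch that converts colony sizes to z-scores and flags those above a cutoff. This is only a minimal illustration under stated assumptions, not the analysis pipeline used in this work: the function names are hypothetical, the z-score is assumed to be computed against the mean and sample standard deviation of all colonies passed in, and the default cutoff (read here as 2.5) is an assumption based on the caption.

```python
from statistics import mean, stdev


def z_scores(colony_sizes):
    """Return the z-score of each colony size.

    Assumption: the reference distribution is simply all colonies given,
    which may differ from the background actually used in the study.
    """
    mu = mean(colony_sizes)
    sd = stdev(colony_sizes)  # sample standard deviation
    return [(size - mu) / sd for size in colony_sizes]


def call_positives(colony_sizes, threshold=2.5):
    """Flag colonies whose z-score exceeds the (assumed) threshold."""
    return [z > threshold for z in z_scores(colony_sizes)]


# Hypothetical plate: twenty background colonies and one large colony.
sizes = [100] * 20 + [1000]
calls = call_positives(sizes)
```

With these made-up values, only the last colony is flagged as a positive PPI; real plates would normally need normalization steps that are not shown here.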


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result of the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
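The idea of a distance-weighted shortest path over surface residues, as in panels (C) and (D), can be sketched with Dijkstra's algorithm. This is a hedged illustration rather than the procedure actually used: the function name, the neighbor cutoff and the toy coordinates are all hypothetical, and the graph is assumed to link any two surface points closer than the cutoff, with the Euclidean distance as edge weight.

```python
import heapq
import math


def surface_path_length(points, start, end, cutoff):
    """Length of the shortest path from points[start] to points[end].

    Steps are allowed only between points at most `cutoff` apart, so the
    path follows the cloud of surface points instead of cutting through
    the complex. Returns math.inf if the end point cannot be reached.
    """
    best = [math.inf] * len(points)
    best[start] = 0.0
    queue = [(0.0, start)]
    while queue:
        dist_so_far, i = heapq.heappop(queue)
        if i == end:
            return dist_so_far
        if dist_so_far > best[i]:
            continue  # stale queue entry
        for j, other in enumerate(points):
            if j == i:
                continue
            step = math.dist(points[i], other)
            if step <= cutoff and dist_so_far + step < best[j]:
                best[j] = dist_so_far + step
                heapq.heappush(queue, (best[j], j))
    return math.inf


# Toy coordinates (arbitrary units): the direct hop 0 -> 2 is 5 units long.
toy_points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
around = surface_path_length(toy_points, 0, 2, cutoff=4.5)  # forced via point 1
direct = surface_path_length(toy_points, 0, 2, cutoff=6.0)  # direct hop allowed
```

In the toy example, the smaller cutoff forces the path through the intermediate point, which is the behavior one would want when two C-termini sit on opposite sides of a large complex.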


General conclusion

The aim of this project was to develop a relatively simple hybrid method. The term hybrid method refers to a method that detects associations between proteins that are close to one another in space without necessarily interacting physically. Such a method would make it possible to examine and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to verify whether increasing linker length changed the interactions detected. It was also relevant to test the application of the method to the study of protein complexes using several combinations of linkers of different lengths. Finally, validation of the method could be completed by comparing the results obtained with the distances measured from the available protein structures of the proteasome.

The results of the first validation show that adjusting a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and allowed associations to be identified more reliably. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, of all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths used. For example, this number could be lower if only a single combination were used, because some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect the majority of the new PPIs and of those whose signal increases.


The rare cases in which the signal decreased as linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when it involves the central proteins. Finally, the associations detected closely reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing linker length does make it possible to detect associations between proteins that are farther apart in space.

The modification made to DHFR PCA represents a significant advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside the linkers of additional lengths, which would provide a validation and a point of comparison and would reveal problems that may have arisen during construction of the proteins. For example, it becomes easier to spot faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close together, which reduces false positives. DHFR PCA can be performed on solid or in liquid medium. It is therefore easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the dynamics of interactions (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would provide several resolutions of the PPI network. The first step would be to determine the maximum length at which plausible protein-protein associations are still detected, so as to limit false positives. It would also be necessary to determine the optimal increment that maximizes new information while taking into account the added complexity of each additional linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a lower resolution would be needed for large protein complexes.

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by linker extension would give us a general picture of their architecture. At the scale of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.

Page 53: Mesurer les associations protéiques à proximité in …...Mesurer les associations protéiques à proximité in vivo en utilisant la complémentation de fragments protéiques Mémoire

40

41

Figure S1 Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein

stability Act1 protein was used as a loading control (B) Distribution of PPIs signal (colony

size) obtained in the global PCA (top left) and in the intra-complexes (Proteasome - top right

RNApol I II and III - bottom left and COG complex - bottom right) experiments PPIs with

a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have

a z-score above 25 (C) Example of correlation observed for PPI signals from reciprocal

interactions with the 4xL-4xL combination Correlation coefficients for the other

combinations are r=092 for 2xL-2xL r=053 for 2xL-4xL and r=074 for 4xL-2xL (D)

Density of PPI z-scores for the proteasome for all combinations of linker lengths according

to the distance between the interacting proteins The red line represents the density of

distances for all interactions The distribution for detected interactions is shifted to the left

because proteins are closer to each other when the interactions are detected The 4xL-4xL

distributions is also slightly shifted to the right due to the ability of the 4xL to detect

interactions further in space (E) Repetition of the standard DHFR PCA for selected results

for the global PCA experiment showing a strong reproducibility (F) Confirmation by DHFR

PCA in spot-dilution assay of selected results for the intra-complexes experiment Examples

for each category of changes are shown Cell growth in spot-dilution assay (right) correlates

with colony size in standard PCA (left)

42

Figure S2 Illustration of the methods used to build the proteasome structure and to

calculate distances between proteins

(A) (Top) PDB structure 5A5B Gray lid and base Red and yellow core (Middle) PDB

structure 5CZ4 composed of the full proteasome core (Bottom) 5A5B structures aligned on

the 5CZ4 structure (B) Final proteasome structure (Top) Result from the alignment of two

5A5B structures on the 5CZ4 structure as seen in (A) (Middle) Close view of the overlap

between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right)

(Bottom) Final proteasome structure Gray lid and base Red cyan blue and yellow core

(C) Example of a distance weighted shortest path between the C-termini of Scl1 and Rpn5

Dark green Scl1 Light green Rpn5 Green spheres residues used to calculate the distance

weighted shortest path Magenta spheres C-terminal residues (D) Surface residues used for

distance weighted shortest path calculations Gray cartoon proteasome Purple spheres dots

surface Green spheres surface residues on the proteasome

43

Conclusion geacuteneacuterale

Le but de ce projet eacutetait de deacutevelopper une meacutethode hybride relativement simple Le terme

meacutethode hybride deacutesigne une meacutethode permettant de deacutetecter des associations entre des

proteacuteines agrave proximiteacute dans lrsquoespace sans qursquoelles ne soient neacutecessairement des interactions

physiques Cette meacutethode permettrait ainsi drsquoapprofondir et de mieux disseacutequer lrsquoarchitecture

des complexes proteacuteiques Concregravetement il srsquoagissait de modifier la longueur des

connecteurs de la DHFR PCA chez S cerevisiae Afin de valider la meacutethode il fallait drsquoabord

veacuterifier si lrsquoaugmentation de la longueur du connecteur permettait de modifier les interactions

deacutetecteacutees Il eacutetait eacutegalement pertinent de veacuterifier lrsquoapplication de la meacutethode pour lrsquoeacutetude de

complexes proteacuteiques agrave lrsquoaide de plusieurs combinaisons de connecteurs de diffeacuterentes

longueurs Enfin la confirmation de la validiteacute de la meacutethode pouvait ecirctre compleacuteteacutee par la

comparaison des reacutesultats obtenus avec les distances mesureacutees agrave partir des structures

proteacuteiques disponibles du proteacuteasome

The results of the first validation show that adjusting a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thereby improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the detected associations suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while retaining the specificity of the method. Indeed, among all the unique interactions detected, more than 30% were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only a single combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases.


The rare cases in which the signal decreased as linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers provides information on the organization of protein complexes, particularly when central proteins are involved. Finally, the detected associations closely reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing linker length does allow the detection of associations between proteins that are farther apart in space.
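The comparison between structural distances and PCA results can be sketched as follows. This is only an illustration, not the analysis performed in the thesis. It assumes each protein pair is described by a distance measured on the structure and a flag indicating whether the PCA detected it, and it summarizes each linker combination by the median distance of its detected pairs. The combination labels follow the 2xL/4xL naming of the supplementary figures, and the distances are invented toy values.

```python
from statistics import median


def median_detected_distance(pairs):
    """Median structural distance of the pairs flagged as detected.

    `pairs` is a list of (distance_in_angstroms, detected) tuples.
    Returns None when no pair was detected.
    """
    detected = [distance for distance, hit in pairs if hit]
    return median(detected) if detected else None


# Invented toy data: the same four protein pairs scored with two linker combinations.
toy_results = {
    "2xL-2xL": [(20.0, True), (35.0, True), (60.0, False), (90.0, False)],
    "4xL-4xL": [(20.0, True), (35.0, True), (60.0, True), (90.0, False)],
}
summary = {combo: median_detected_distance(pairs)
           for combo, pairs in toy_results.items()}
```

A higher median for the longer-linker combination is the pattern one would expect if longer linkers reach more distant partners. A real analysis would also need a statistical test and the full distance distribution.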

The modification made to the DHFR PCA represents a notable advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the ability to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside linkers of additional lengths. This would provide a validation and a point of comparison, and would help detect problems that may have arisen during construction of the fusion proteins. For example, it becomes easier to spot faulty recombination or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are brought into very close proximity, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium, so PPIs are easy to study under multiple growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that provides several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also need to be determined so as to maximize new information while accounting for the added complexity that each additional linker brings. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.
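The relationship between linker length and reach can be made concrete with a back-of-the-envelope sketch. Every number in it is an assumption rather than a value from the thesis: linkers are taken to be built from five-residue repeats, each residue is taken to span roughly 3.5 angstroms when fully extended, and the size of the DHFR fragments themselves is ignored. The result is therefore a loose upper bound, not a measured detection range.

```python
def upper_bound_reach(repeats_a, repeats_b,
                      residues_per_repeat=5, angstrom_per_residue=3.5):
    """Rough upper bound (in angstroms) on the separation that two fully
    extended linkers could bridge. All defaults are illustrative assumptions."""
    total_residues = (repeats_a + repeats_b) * residues_per_repeat
    return total_residues * angstrom_per_residue


# Hypothetical linker combinations expressed as numbers of repeats.
reach = {
    "2x-2x": upper_bound_reach(2, 2),
    "2x-4x": upper_bound_reach(2, 4),
    "4x-4x": upper_bound_reach(4, 4),
}
```

Flexible linkers rarely adopt a fully extended conformation, so the effective reach is expected to be shorter than these bounds. The sketch only shows how a graded series of linker lengths would translate into a graded series of spatial resolutions.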

The method developed during this master's project is particularly attractive for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but that are visible by electron microscopy or other imaging methods. Their size severely limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by extending the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.

Page 54: Mesurer les associations protéiques à proximité in …...Mesurer les associations protéiques à proximité in vivo en utilisant la complémentation de fragments protéiques Mémoire

41

Figure S1 Data related to the PCA experiments

(A) Western blots confirming that the introduction of a longer linker does not impair protein

stability Act1 protein was used as a loading control (B) Distribution of PPIs signal (colony

size) obtained in the global PCA (top left) and in the intra-complexes (Proteasome - top right

RNApol I II and III - bottom left and COG complex - bottom right) experiments PPIs with

a colony size above the threshold (dashed or gray lines) correspond to positive PPIs and have

a z-score above 25 (C) Example of correlation observed for PPI signals from reciprocal

interactions with the 4xL-4xL combination Correlation coefficients for the other

combinations are r=092 for 2xL-2xL r=053 for 2xL-4xL and r=074 for 4xL-2xL (D)

Density of PPI z-scores for the proteasome for all combinations of linker lengths according

to the distance between the interacting proteins The red line represents the density of

distances for all interactions The distribution for detected interactions is shifted to the left

because proteins are closer to each other when the interactions are detected The 4xL-4xL

distributions is also slightly shifted to the right due to the ability of the 4xL to detect

interactions further in space (E) Repetition of the standard DHFR PCA for selected results

for the global PCA experiment showing a strong reproducibility (F) Confirmation by DHFR

PCA in spot-dilution assay of selected results for the intra-complexes experiment Examples

for each category of changes are shown Cell growth in spot-dilution assay (right) correlates

with colony size in standard PCA (left)

42

Figure S2 Illustration of the methods used to build the proteasome structure and to

calculate distances between proteins

(A) (Top) PDB structure 5A5B Gray lid and base Red and yellow core (Middle) PDB

structure 5CZ4 composed of the full proteasome core (Bottom) 5A5B structures aligned on

the 5CZ4 structure (B) Final proteasome structure (Top) Result from the alignment of two

5A5B structures on the 5CZ4 structure as seen in (A) (Middle) Close view of the overlap

between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right)

(Bottom) Final proteasome structure Gray lid and base Red cyan blue and yellow core

(C) Example of a distance weighted shortest path between the C-termini of Scl1 and Rpn5

Dark green Scl1 Light green Rpn5 Green spheres residues used to calculate the distance

weighted shortest path Magenta spheres C-terminal residues (D) Surface residues used for

distance weighted shortest path calculations Gray cartoon proteasome Purple spheres dots

surface Green spheres surface residues on the proteasome

43

Conclusion geacuteneacuterale

Le but de ce projet eacutetait de deacutevelopper une meacutethode hybride relativement simple Le terme

meacutethode hybride deacutesigne une meacutethode permettant de deacutetecter des associations entre des

proteacuteines agrave proximiteacute dans lrsquoespace sans qursquoelles ne soient neacutecessairement des interactions

physiques Cette meacutethode permettrait ainsi drsquoapprofondir et de mieux disseacutequer lrsquoarchitecture

des complexes proteacuteiques Concregravetement il srsquoagissait de modifier la longueur des

connecteurs de la DHFR PCA chez S cerevisiae Afin de valider la meacutethode il fallait drsquoabord

veacuterifier si lrsquoaugmentation de la longueur du connecteur permettait de modifier les interactions

deacutetecteacutees Il eacutetait eacutegalement pertinent de veacuterifier lrsquoapplication de la meacutethode pour lrsquoeacutetude de

complexes proteacuteiques agrave lrsquoaide de plusieurs combinaisons de connecteurs de diffeacuterentes

longueurs Enfin la confirmation de la validiteacute de la meacutethode pouvait ecirctre compleacuteteacutee par la

comparaison des reacutesultats obtenus avec les distances mesureacutees agrave partir des structures

proteacuteiques disponibles du proteacuteasome

Les reacutesultats de la premiegravere validation deacutemontrent qursquoen jouant sur un seul paramegravetre soit

en doublant la longueur drsquoun connecteur le ratio signal sur bruit a significativement

augmenteacute permettant une meilleure identification des associations Sept nouvelles

associations ont eacuteteacute observeacutees agrave lrsquointeacuterieur de complexes proteacuteiques et entre diffeacuterents

complexes notamment entre le proteacuteasome et le cytosquelette drsquoactine La nature des

associations deacutetecteacutees suggegravere que la speacutecificiteacute de la DHFR PCA est conserveacutee malgreacute la

modification de la longueur du connecteur Lrsquoeacutetude approfondie des cinq complexes

proteacuteiques montre que la variation de la DHFR PCA permet de deacutetecter de nouvelles

interactions en conservant la speacutecificiteacute de la meacutethode En effet parmi lrsquoensemble des

interactions uniques deacutetecteacutees plus de 30 eacutetaient nouvelles Donc on pourrait srsquoattendre agrave

obtenir pratiquement autant de nouvelles interactions si cette variation de la PCA eacutetait

appliqueacutee agrave des complexes proteacuteiques deacutejagrave eacutetudieacutes Ce pourcentage pourrait varier selon le

nombre de combinaisons de connecteurs de diffeacuterentes longueurs utiliseacute Par exemple ce

nombre pourrait ecirctre reacuteduit en nrsquoutilisant qursquoune seule combinaison puisque certaines

associations proteacuteine-proteacuteine eacutetaient uniquement deacutetectables avec une combinaison preacutecise

de connecteurs Lrsquoutilisation drsquoun connecteur allongeacute pour le fragment DHFR F[12] semble

ecirctre suffisante pour deacutetecter la majoriteacute des nouvelles PPI et celles dont le signal augmente

44

Les rares cas ougrave le signal diminuait avec lrsquoaugmentation de la longueur du connecteur

seraient davantage causeacutes par des effets steacuteriques plutocirct que par une deacutestabilisation des

proteacuteines impliqueacutees Cependant ces cas peuvent tout de mecircme fournir des informations

structurales notamment en identifiant les associations les plus fortes au sein du complexe

Par ailleurs lrsquoutilisation des connecteurs allongeacutes renseigne sur lrsquoorganisation des complexes

proteacuteiques particuliegraverement lorsqursquoelle implique les proteacuteines centrales Enfin les

associations deacutetecteacutees reflegravetent bien lrsquoorganisation des complexes proteacuteiques en sous-

complexes En comparant les distances entre les proteacuteines des structures du proteacuteasome et

les reacutesultats PCA obtenus il est possible de confirmer que lrsquoaugmentation de la longueur du

connecteur permet effectivement de deacutetecter des associations entre proteacuteines plus eacuteloigneacutees

dans lrsquoespace

The modification made to the DHFR PCA is a real advance in the study of protein-protein associations. Doubling the length of the linker on the DHFR F[1,2] fragment alone increases the capacity to detect distant protein-protein associations. In future experiments, it would be advisable to use the standard linker alongside the linkers of additional lengths. This would provide a validation and a point of comparison, and would help detect problems that arose when building the protein constructs, such as faulty recombination or the appearance of mutations: interactions would be observed for the correctly built protein, whereas the faulty one would show none. Adding this control does make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale with a robotic platform. In addition, the DHFR PCA is an in vivo method that retains the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium, so PPIs are easy to study under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features offer certain advantages over other hybrid methods.

Only two linker lengths were tested in this project. It would be interesting to establish a range of linker lengths that gives access to the PPI network at several resolutions. The first step would be to determine the maximum length that still detects plausible protein-protein associations, so as to limit false positives. The optimal increment would also have to be determined, to maximize the new information gained while accounting for the added complexity that each extra linker brings. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] proteins with different linker lengths. Such additional collections would give a picture of the yeast protein-protein association network at different resolutions, from fine to coarse: the longer the linker, the more distant the associations detected, and the lower the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, may be sufficient, whereas a coarser resolution would be appropriate for large protein complexes.

The method developed during this master's project is particularly attractive for studying macromolecular protein complexes. These are complexes whose composition is not fully known but which can be seen by electron microscopy or other imaging methods. Their size greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are examples. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases, including cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would provide a general picture of their architecture. At the level of an organism's proteome, this method would give a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


Figure S2. Illustration of the methods used to build the proteasome structure and to calculate distances between proteins.

(A) (Top) PDB structure 5A5B. Gray: lid and base. Red and yellow: core. (Middle) PDB structure 5CZ4, composed of the full proteasome core. (Bottom) 5A5B structures aligned on the 5CZ4 structure. (B) Final proteasome structure. (Top) Result from the alignment of two 5A5B structures on the 5CZ4 structure, as seen in (A). (Middle) Close view of the overlap between the core from the two aligned 5A5B structures (left) and the 5CZ4 structure (right). (Bottom) Final proteasome structure. Gray: lid and base. Red, cyan, blue and yellow: core. (C) Example of a distance-weighted shortest path between the C-termini of Scl1 and Rpn5. Dark green: Scl1. Light green: Rpn5. Green spheres: residues used to calculate the distance-weighted shortest path. Magenta spheres: C-terminal residues. (D) Surface residues used for distance-weighted shortest path calculations. Gray cartoon: proteasome. Purple spheres: dots surface. Green spheres: surface residues on the proteasome.
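The caption for panels (C) and (D) refers to a distance-weighted shortest path computed over surface residues. The idea can be illustrated with a minimal sketch, which is not the code used in this work. It assumes that each surface residue is reduced to one 3D point, that two points are connected when they lie within a hypothetical `cutoff` distance, and that each edge is weighted by the Euclidean distance between its endpoints. The function name, the cutoff value and the toy coordinates are all illustrative assumptions.

```python
import heapq
import math


def distance_weighted_shortest_path(coords, start, end, cutoff):
    """Dijkstra's algorithm over a set of 3D points.

    Two points are linked if they are at most `cutoff` apart (an assumed
    connectivity rule); the edge weight is their Euclidean distance.
    Returns (path_length, list_of_point_indices), or (inf, []) if `end`
    cannot be reached from `start`.
    """
    n = len(coords)
    best = [math.inf] * n      # best known path length to each point
    previous = [None] * n      # predecessor on the best path
    best[start] = 0.0
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > best[u]:
            continue           # stale heap entry
        if u == end:
            break
        for v in range(n):
            if v == u:
                continue
            w = math.dist(coords[u], coords[v])
            if w <= cutoff and d + w < best[v]:
                best[v] = d + w
                previous[v] = u
                heapq.heappush(heap, (d + w, v))
    if math.isinf(best[end]):
        return math.inf, []
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = previous[node]
    return best[end], path[::-1]


# Toy example: three points forming a right angle. The direct 0 -> 2 hop
# (length 5) exceeds the cutoff, so the path has to go through point 1.
toy_points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
length, path = distance_weighted_shortest_path(toy_points, 0, 2, cutoff=4.5)
print(length, path)  # 7.0 [0, 1, 2]
```

In the toy example the path length (7.0) is longer than the straight-line distance between the two end points (5.0), which is the reason for following a path along surface points rather than a line that would cut through the structure.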


General conclusion

The goal of this project was to develop a relatively simple hybrid method. Here, a hybrid method is one that detects associations between proteins that lie close to each other in space without necessarily being in direct physical interaction. Such a method makes it possible to dissect the architecture of protein complexes in greater depth. In practice, the work consisted of modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to check whether increasing linker length changed the interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results with distances measured from the available protein structures of the proteasome.

The first validation shows that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and so improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method: more than 30% of all unique interactions detected were new. A similar proportion of new interactions could therefore be expected if this PCA variant were applied to protein complexes that have already been studied. This percentage could vary with the number of linker-length combinations used. For example, it could be lower if only one combination were used, since some protein-protein associations were detectable only with a specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and those whose signal increases.

201233(2)109-18 34 Zoraghi R Reiner NE Protein interaction networks as starting points to identify novel antimicrobial drug targets Current opinion in microbiology 201316(5)566-72 35 Khare S Nagle AS Biggart A Lai YH Liang F Davis LC et al Proteasome inhibition for treatment of leishmaniasis Chagas disease and sleeping sickness Nature 2016 36 Sahni N Yi S Taipale M Fuxman Bass JI Coulombe-Huntington J Yang F et al Widespread macromolecular interaction perturbations in human genetic disorders Cell 2015161(3)647-60 37 Jensen LJ Bork P Biochemistry Not comparable but complementary Science 2008322(5898)56-7 38 Syafrizayanti Betzen C Hoheisel JD Kastelic D Methods for analyzing and quantifying protein-protein interaction Expert review of proteomics 201411(1)107-20 39 Marcilla M Albar JP Quantitative proteomics A strategic ally to map protein interaction networks IUBMB life 201365(1)9-16 40 Woods AG Sokolowska I Ngounou Wetie AG Wormwood K Aslebagh R Patel S et al Mass spectrometry for proteomics-based investigation Advances in experimental medicine and biology 20148061-32 41 Chen GI Gingras AC Affinity-purification mass spectrometry (AP-MS) of serinethreonine phosphatases Methods 200742(3)298-305 42 Dunham WH Mullin M Gingras AC Affinity-purification coupled to mass spectrometry basic principles and strategies Proteomics 201212(10)1576-90

48

43 Monti M Cozzolino M Cozzolino F Vitiello G Tedesco R Flagiello A et al Puzzle of protein complexes in vivo a present and future challenge for functional proteomics Expert review of proteomics 20096(2)159-69 44 Fields S Song O A novel genetic system to detect protein-protein interactions Nature 1989340(6230)245-6 45 Petschnigg J Moe OW Stagljar I Using yeast as a model to study membrane proteins Current opinion in nephrology and hypertension 201120(4)425-32 46 Saraon P Grozavu I Lim SH Snider J Yao Z Stagljar I Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay Current protocols in chemical biology 20179(1)38-54 47 Snider J Kittanakom S Curak J Stagljar I Split-ubiquitin based membrane yeast two-hybrid (MYTH) system a powerful tool for identifying protein-protein interactions Journal of visualized experiments JoVE 2010(36) 48 Stynen B Tournu H Tavernier J Van Dijck P Diversity in genetic in vivo methods for protein-protein interaction studies from the yeast two-hybrid system to the mammalian split-luciferase system Microbiology and molecular biology reviews MMBR 201276(2)331-82 49 Bruckner A Polge C Lentze N Auerbach D Schlattner U Yeast two-hybrid a powerful tool for systems biology International journal of molecular sciences 200910(6)2763-88 50 Snider J Kotlyar M Saraon P Yao Z Jurisica I Stagljar I Fundamentals of protein interaction network mapping Mol Syst Biol 201511(12)848 51 Vidal M Fields S The yeast two-hybrid assay still finding connections after 25 years Nat Methods 201411(12)1203-6 52 Johnsson N Varshavsky A Split ubiquitin as a sensor of protein interactions in vivo Proceedings of the National Academy of Sciences of the United States of America 199491(22)10340-4 53 Stagljar I Fields S Analysis of membrane protein interactions using yeast-based technologies Trends in biochemical sciences 200227(11)559-63 54 Michnick SW Exploring protein interactions by interaction-induced folding of proteins from 
complementary peptide fragments Current opinion in structural biology 200111(4)472-7 55 Tarassov K Messier V Landry CR Radinovic S Serna Molina MM Shames I et al An in vivo map of the yeast protein interactome Science 2008320(5882)1465-70 56 Freschi L Torres-Quiroz F Dube AK Landry CR qPCA a scalable assay to measure the perturbation of protein-protein interactions in living cells Molecular bioSystems 20139(1)36-43 57 Rochette S Diss G Filteau M Leducq JB Dube AK Landry CR Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells J Vis Exp 2015(97) 58 Chen X Zaro JL Shen WC Fusion protein linkers property design and functionality Advanced drug delivery reviews 201365(10)1357-69 59 Yu K Liu C Kim BG Lee DY Synthetic fusion protein design and applications Biotechnology advances 201533(1)155-64 60 Petschnigg J Snider J Stagljar I Interactive proteomics research technologies recent applications and advances Curr Opin Biotechnol 201122(1)50-8 61 Stryer L Haugland RP Energy transfer a spectroscopic ruler Proceedings of the National Academy of Sciences of the United States of America 196758(2)719-26 62 Stryer L Fluorescence energy transfer as a spectroscopic ruler Annual review of biochemistry 197847819-46 63 Piehler J New methodologies for measuring protein interactions in vivo and in vitro Current opinion in structural biology 200515(1)4-14

49

64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9

50

84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709

General conclusion

The aim of this project was to develop a relatively simple hybrid method. Here, a hybrid method is one that detects associations between proteins that lie close together in space, without those associations necessarily being physical interactions. Such a method would make it possible to probe and dissect the architecture of protein complexes in greater depth. In practice, this meant modifying the length of the linkers used in the DHFR PCA in S. cerevisiae. To validate the method, it was first necessary to test whether increasing the linker length changed the set of interactions detected. It was also relevant to test the method on protein complexes using several combinations of linkers of different lengths. Finally, the validation could be completed by comparing the results with distances measured in the available protein structures of the proteasome.

The results of the first validation show that changing a single parameter, namely doubling the length of one linker, significantly increased the signal-to-noise ratio and thereby improved the identification of associations. Seven new associations were observed within protein complexes and between different complexes, notably between the proteasome and the actin cytoskeleton. The nature of the associations detected suggests that the specificity of the DHFR PCA is preserved despite the change in linker length. The in-depth study of the five protein complexes shows that this variation of the DHFR PCA detects new interactions while preserving the specificity of the method. Indeed, more than 30% of all the unique interactions detected were new. One could therefore expect to obtain nearly as many new interactions if this variation of the PCA were applied to protein complexes that have already been studied. This percentage could vary with the number of combinations of linkers of different lengths that are used. For example, it could be lower if only a single combination were used, since some protein-protein associations were detectable only with one specific combination of linkers. Using an extended linker on the DHFR F[1,2] fragment appears to be sufficient to detect most of the new PPIs and most of those whose signal increases.

The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. In addition, the use of extended linkers is informative about the organization of protein complexes, particularly when central proteins are involved. Finally, the associations detected accurately reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does indeed allow the detection of associations between proteins that are farther apart in space.
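The comparison between structural distances and PCA results described above can be illustrated with a minimal sketch. It is not the analysis performed in this thesis: the subunit names, the centroid coordinates, the detection sets and the use of centroid-to-centroid distance are all hypothetical assumptions chosen only to show the logic. If the extended linker reaches more distant partners, the associations gained with it should lie at larger distances than those already detected with the standard linker.

```python
from itertools import combinations
from math import dist
from statistics import median

# Hypothetical subunit centroids in angstroms. Real values would have to be
# extracted from a proteasome structure; none are taken from the thesis.
centroids = {
    "A": (0.0, 0.0, 0.0),
    "B": (30.0, 0.0, 0.0),
    "C": (0.0, 80.0, 0.0),
    "D": (60.0, 80.0, 0.0),
}

# Hypothetical PCA outcomes: protein pairs detected with each linker length.
detected_standard = {frozenset(("A", "B"))}
detected_long = {frozenset(("A", "B")), frozenset(("A", "C"))}


def pair_distances(coords):
    """Return the centroid-to-centroid distance for every unordered pair."""
    return {
        frozenset((a, b)): dist(coords[a], coords[b])
        for a, b in combinations(sorted(coords), 2)
    }


def median_distance(pairs, distances):
    """Median structural distance over a set of detected pairs."""
    return median(distances[p] for p in pairs)


distances = pair_distances(centroids)
# Associations seen only with the extended linker.
gained = detected_long - detected_standard

print("standard linker:", median_distance(detected_standard, distances))
print("gained with long linker:", median_distance(gained, distances))
```

With real data, the two groups of distances would be compared with an appropriate statistical test rather than by inspecting two medians.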

The modification made to the DHFR PCA is a useful advance in the study of protein-protein associations. Simply doubling the length of the linker on the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker alongside the linkers of additional lengths. This would provide both a validation and a point of comparison, and would help detect problems that arose during construction of the fusion proteins, such as incorrect recombination or the appearance of mutations. Interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variation of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no special infrastructure, yet it can also be applied at large scale using a robotic platform. Moreover, the DHFR PCA is an in vivo method that keeps the endogenous promoter for protein expression. The fragments do not tend to associate spontaneously unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium, so it is easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be followed in real time, which gives access to the dynamics of interactions (56). These features give it certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that gives access to several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The optimal increment would also have to be determined, so as to maximize the new information gained while accounting for the additional complexity introduced by each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] fusion proteins with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the associations detected, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas a coarser resolution would be needed for large protein complexes.

The method developed during this master's project is particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. Their size severely limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by extending the linker would give a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.

Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.

22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.

43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

49

64 Back JW de Jong L Muijsers AO de Koster CG Chemical cross-linking and mass spectrometry for protein structural modeling J Mol Biol 2003331(2)303-13 65 Leitner A Faini M Stengel F Aebersold R Crosslinking and Mass Spectrometry An Integrated Technology to Understand the Structure and Function of Molecular Machines Trends in biochemical sciences 201641(1)20-32 66 Rappsilber J The beginning of a beautiful friendship cross-linkingmass spectrometry and modelling of proteins and multi-protein complexes J Struct Biol 2011173(3)530-40 67 Vasilescu J Guo X Kast J Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry Proteomics 20044(12)3845-54 68 Roux KJ Kim DI Raida M Burke B A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells The Journal of cell biology 2012196(6)801-10 69 Remy I Wilson IA Michnick SW Erythropoietin receptor activation by a ligand-induced conformation change Science 1999283(5404)990-3 70 Botstein D Fink GR Yeast an experimental organism for 21st Century biology Genetics 2011189(3)695-704 71 Gagnon-Arsenault I Marois Blanchet FC Rochette S Diss G Dube AK Landry CR Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication J Proteomics 201381112-25 72 Vo TV Das J Meyer MJ Cordero NA Akturk N Wei X et al A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human Cell 2016164(1-2)310-23 73 Arabidopsis Interactome Mapping C Evidence for network evolution in an Arabidopsis interactome map Science 2011333(6042)601-7 74 Filteau M Vignaud H Rochette S Diss G Chretien AE Berger CM et al Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation robustness and insights into genotype-phenotype maps Briefings in functional genomics 2015 75 Sahni N Yi S Zhong Q Jailkhani N Charloteaux B Cusick ME et al Edgotype a fundamental link between genotype and 
phenotype Curr Opin Genet Dev 201323(6)649-57 76 Yang X Coulombe-Huntington J Kang S Sheynkman GM Hao T Richardson A et al Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing Cell 2016164(4)805-17 77 Bisson N James DA Ivosev G Tate SA Bonner R Taylor L et al Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor Nat Biotechnol 201129(7)653-8 78 Ori A Iskar M Buczak K Kastritis P Parca L Andres-Pons A et al Spatiotemporal variation of mammalian protein complex stoichiometries Genome Biol 20161747 79 Rochette S Gagnon-Arsenault I Diss G Landry CR Modulation of the yeast protein interactome in response to DNA damage Journal of proteomics 201410025-36 80 Grossmann A Benlasfer N Birth P Hegele A Wachsmuth F Apelt L et al Phospho-tyrosine dependent protein-protein interaction network Mol Syst Biol 201511(3)794 81 Landry CR Levy ED Abd Rabbo D Tarassov K Michnick SW Extracting insight from noisy cellular networks Cell 2013155(5)983-9 82 Wan C Borgeson B Phanse S Tu F Drew K Clark G et al Panorama of ancient metazoan macromolecular complexes Nature 2015525(7569)339-44 83 Kristensen AR Gsponer J Foster LJ A high-throughput approach for measuring temporal changes in the interactome Nat Methods 20129(9)907-9

50

84 Benschop JJ Brabers N van Leenen D Bakker LV van Deutekom HW van Berkum NL et al A consensus of core protein complex compositions for Saccharomyces cerevisiae Molecular cell 201038(6)916-28 85 Ideker T Krogan NJ Differential network biology Mol Syst Biol 20128565 86 Baker M Proteomics The interaction map Nature 2012484(7393)271-5 87 Michnick SW Ear PH Manderson EN Remy I Stefan E Universal strategies in research and drug discovery based on protein-fragment complementation assays Nat Rev Drug Discov 20076(7)569-82 88 Robinson CV Sali A Baumeister W The molecular sociology of the cell Nature 2007450(7172)973-82 89 Michnick SW Ear PH Landry C Malleshaiah MK Messier V A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells Methods Enzymol 2010470335-68 90 Ear PH Michnick SW A general life-death selection strategy for dissecting protein functions Nat Methods 20096(11)813-6 91 Remy I Michnick SW Mapping biochemical networks with protein fragment complementation assays Methods Mol Biol 20151278467-81 92 Stefan E Aquin S Berger N Landry CR Nyfeler B Bouvier M et al Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo Proc Natl Acad Sci U S A 2007104(43)16916-21 93 Tchekanda E Sivanesan D Michnick SW An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions Nat Methods 201411(6)641-4 94 Kerppola TK Visualization of molecular interactions using bimolecular fluorescence complementation analysis characteristics of protein fragment complementation Chem Soc Rev 200938(10)2876-86 95 Gibson TJ One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size Nature Protocol Exchange 2009 Available from httpwwwnaturecomprotocolexchangeprotocols554 96 Chatr-Aryamontri A Oughtred R Boucher L Rust J Chang C Kolas NK et al The BioGRID interaction database 
2017 update Nucleic Acids Res 201745(D1)D369-D79 97 Haarer B Aggeli D Viggiano S Burke DJ Amberg DC Novel interactions between actin and the proteasome revealed by complex haploinsufficiency PLoS Genet 20117(9)e1002288 98 Guerrero C Milenkovic T Przulj N Kaiser P Huang L Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis Proc Natl Acad Sci U S A 2008105(36)13333-8 99 Archambault J Friesen JD Genetics of eukaryotic RNA polymerases I II and III Microbiol Rev 199357(3)703-24 100 Leitner A Walzthoeni T Aebersold R Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MSMS and the xQuestxProphet software pipeline Nat Protoc 20149(1)120-37 101 Vogel SS van der Meer BW Blank PS Estimating the distance separating fluorescent protein FRET pairs Methods 201466(2)131-8 102 Anderson P Kedersha N Ivanov P Stress granules P-bodies and cancer Biochimica et biophysica acta 20151849(7)861-70 103 Beckham CJ Parker R P bodies stress granules and viral life cycles Cell host amp microbe 20083(4)206-12 104 Nathans R Chu CY Serquina AK Lu CC Cao H Rana TM Cellular microRNA and P bodies modulate host-HIV-1 interactions Molecular cell 200934(6)696-709

The rare cases in which the signal decreased as the linker length increased are more likely caused by steric effects than by destabilization of the proteins involved. These cases can nevertheless provide structural information, notably by identifying the strongest associations within the complex. Moreover, the use of extended linkers sheds light on the organization of protein complexes, particularly when central proteins are involved. Finally, the detected associations faithfully reflect the organization of protein complexes into subcomplexes. Comparing the distances between proteins in the proteasome structures with the PCA results confirms that increasing the linker length does allow the detection of associations between proteins that are farther apart in space.
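The comparison between structural distances and PCA results can be illustrated with a minimal sketch. It assumes a hypothetical table in which each protein pair carries a distance measured on a structure and a detection status for the standard and the extended linker. The pair names, distances and detection calls below are invented for illustration and are not results from this work.

```python
from statistics import median

# Toy records, invented for illustration only:
# (pair, distance in angstroms, detected with standard linker, detected with extended linker)
records = [
    ("P1-P2", 30.0, True, True),
    ("P1-P3", 50.0, True, True),
    ("P2-P3", 70.0, True, True),
    ("P1-P4", 90.0, False, True),
    ("P2-P4", 100.0, False, True),
    ("P3-P4", 150.0, False, False),
]


def median_distance(records, selector):
    """Median structural distance of the pairs kept by selector(standard, extended).

    Returns None when no pair is selected.
    """
    selected = [dist for _, dist, std, ext in records if selector(std, ext)]
    return median(selected) if selected else None


# Pairs detected with both linkers versus pairs gained only with the extended linker.
both = median_distance(records, lambda std, ext: std and ext)
gained = median_distance(records, lambda std, ext: ext and not std)
print(f"detected with both linkers: median {both} A")
print(f"gained with extended linker: median {gained} A")
```

If the pairs gained with the extended linker have a larger median distance than the pairs detected with both linkers, as in this toy table, the data are consistent with the extended linker reaching more distant proteins.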

The modification made to the DHFR PCA represents a notable advance in the study of protein-protein associations. Simply doubling the length of the linker of the DHFR F[1,2] fragment increases the capacity to detect distant protein-protein associations. In future experiments, it would be appropriate to use the standard linker in addition to the linkers of additional lengths, which would provide a validation and a point of comparison and would help detect problems that occurred during the construction of the proteins. For example, it becomes easier to spot a faulty recombination event or the appearance of mutations: interactions would be observed for the correctly constructed protein, whereas the problematic one would show none. Adding this control does, however, make the experiments and analyses more complex. Despite this drawback, this variant of the DHFR PCA provides an additional hybrid method that remains relatively simple. It requires no particular infrastructure, yet can also be applied at large scale using a robotic platform. Furthermore, the DHFR PCA is an in vivo method that preserves the endogenous promoter for protein expression. The fragments do not tend to interact spontaneously with each other unless they are very close together, which reduces false positives. The DHFR PCA can be performed on either solid or liquid medium, so it is easy to study PPIs under several growth conditions or in the presence of cellular perturbations. It can also be monitored in real time, which gives access to the study of interaction dynamics (56). These features offer certain advantages over other hybrid methods.

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that yields several resolutions of the PPI network. First, the maximal length that still detects plausible protein-protein associations, thereby limiting false positives, would have to be determined. Second, the optimal increment would have to be determined so as to maximize the new information gained while taking into account the complexity added with each additional linker. The availability of robotic platforms makes the creation of collections of DHFR F[1,2] proteins with different linker lengths more realistic. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into consideration. For small protein complexes, a finer resolution, and therefore shorter linkers, could be sufficient, whereas the resolution should be coarser for large protein complexes.
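The idea of viewing a network at several resolutions can be sketched as follows. This is only an illustration under strong assumptions: each linker is assigned a hypothetical maximal reach, and a pair is considered detectable when its distance does not exceed that reach. The distances, the linker names and the reach values are invented and are not measurements from this project.

```python
# Hypothetical pairwise distances (in angstroms) between the tagged termini of
# four subunits of a toy complex. Values are invented for illustration.
distances = {
    ("A", "B"): 40.0,
    ("A", "C"): 75.0,
    ("A", "D"): 130.0,
    ("B", "C"): 55.0,
    ("B", "D"): 95.0,
    ("C", "D"): 60.0,
}

# Hypothetical maximal reach (in angstroms) assumed for each linker length.
linker_reach = {"standard": 80.0, "double": 110.0, "triple": 140.0}


def detectable_pairs(distances, reach):
    """Pairs whose distance does not exceed the assumed reach."""
    return {pair for pair, dist in distances.items() if dist <= reach}


def resolution_series(distances, linker_reach):
    """For each linker, from shortest to longest reach, return a tuple
    (name, all detectable pairs, pairs newly gained at this step)."""
    series = []
    seen = set()
    for name, reach in sorted(linker_reach.items(), key=lambda item: item[1]):
        pairs = detectable_pairs(distances, reach)
        series.append((name, pairs, pairs - seen))
        seen |= pairs
    return series


series = resolution_series(distances, linker_reach)
for name, pairs, gained in series:
    print(f"{name}: {len(pairs)} pairs detectable, {len(gained)} newly gained")
```

Counting the pairs newly gained at each step gives one simple way to judge whether an additional linker length adds enough information to justify the extra experimental complexity.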

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies (P-bodies) and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution made possible by lengthening the linker would give us a general picture of their architecture. At the level of an organism's proteome, this method would provide a better definition of the organization of the cellular machinery.

Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics : TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics : MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.

22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics : MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.

43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments : JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews : MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.

64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.

84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.

Page 58: Mesurer les associations protéiques à proximité in …...Mesurer les associations protéiques à proximité in vivo en utilisant la complémentation de fragments protéiques Mémoire

45

suivie en temps reacuteel ce qui donne accegraves agrave lrsquoeacutetude de la dynamique des interactions (56) Ces

eacuteleacutements apportent certains avantages comparativement aux autres meacutethodes hybrides

In this project, only two linker lengths were tested. It would be interesting to establish a range of linker lengths that would give several resolutions of the PPI network. The first step would be to determine the maximum length that still detects plausible protein-protein associations while limiting false positives. The next would be to determine the optimal increment that maximizes new information, taking into account the additional complexity introduced with each added linker. The availability of robotic platforms makes it more realistic to create collections of DHFR F[1,2] protein fusions with different linker lengths. Such additional collections would provide a picture of the yeast protein-protein association network at different resolutions, from fine to coarse. Indeed, the longer the linker, the more distant the detected associations, which lowers the molecular resolution. Before investigating a protein complex more exhaustively, its characteristics, such as its size and flexibility, should be taken into account. For small protein complexes, a finer resolution, and therefore shorter linkers, might be sufficient, whereas the resolution should be coarser for large protein complexes.
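The trade-off between linker length and resolution can be illustrated with a rough calculation. The sketch below is not part of the method described here: it assumes that each of the two reporter fragments is tethered by one flexible linker, and that a fully extended linker spans about 3.8 angstroms per residue. The function names, the series of lengths (10 to 50 residues) and the per-residue spacing are all hypothetical choices made for illustration. Because flexible linkers are rarely fully extended in the cell, and because the size of the reporter fragments and the position of the tagged termini are ignored, the values are only an upper bound on the linker contribution.

```python
# Hypothetical sketch: how linker length could bound the distance a
# proximity assay can bridge. All numbers are illustrative assumptions.

ANGSTROM_PER_RESIDUE = 3.8  # assumed residue spacing in a fully extended chain


def max_span_angstrom(linker_residues: int, n_linkers: int = 2) -> float:
    """Upper bound on the distance bridged by fully extended linkers.

    n_linkers defaults to 2 because each of the two reporter fragments
    is assumed to be tethered to its protein by one linker.
    """
    if linker_residues < 0 or n_linkers < 0:
        raise ValueError("lengths and counts must be non-negative")
    return linker_residues * n_linkers * ANGSTROM_PER_RESIDUE


def linker_series(shortest: int, longest: int, step: int) -> list[int]:
    """Evenly spaced linker lengths (in residues), from shortest up to longest."""
    if step <= 0:
        raise ValueError("step must be positive")
    return list(range(shortest, longest + 1, step))


if __name__ == "__main__":
    # Print the assumed upper bound for each candidate linker length.
    for n in linker_series(10, 50, 10):
        span = max_span_angstrom(n)
        print(f"{n:>3} residues per linker: at most {span:.0f} angstroms")
```

A smaller `step` gives a finer series of resolutions but requires more strain collections, which is the cost that the choice of increment would have to balance.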

The method developed during this master's project becomes particularly interesting for the study of macromolecular protein complexes. These are complexes whose composition is not fully known but which are visible by electron microscopy or other imaging methods. The size of these complexes greatly limits their study and makes determining their architecture a challenge. Processing bodies and stress granules are one example. They are involved, respectively, in the degradation and the storage of messenger RNA during cellular stress, and they are notably linked to various diseases such as cancer and acquired immunodeficiency syndrome (102-104). The scale of resolution afforded by lengthening the linker would give us a general picture of their architecture. For the proteome of an organism, this method would provide a better definition of the organization of the cellular machinery.


Bibliography

1. Vidal M, Cusick ME, Barabasi AL. Interactome networks and human disease. Cell. 2011;144(6):986-98.
2. Taylor SS, Ilouz R, Zhang P, Kornev AP. Assembly of allosteric macromolecular switches: lessons from PKA. Nature reviews Molecular cell biology. 2012;13(10):646-58.
3. Vandamme J, Castermans D, Thevelein JM. Molecular mechanisms of feedback inhibition of protein kinase A on intracellular cAMP accumulation. Cellular signalling. 2012;24(8):1610-8.
4. Conrad M, Schothorst J, Kankipati HN, Van Zeebroeck G, Rubio-Texeira M, Thevelein JM. Nutrient sensing and signaling in the yeast Saccharomyces cerevisiae. FEMS microbiology reviews. 2014;38(2):254-99.
5. Broach JR. RAS genes in Saccharomyces cerevisiae: signal transduction in search of a pathway. Trends in genetics: TIG. 1991;7(1):28-33.
6. Fontana L, Partridge L, Longo VD. Extending healthy life span--from yeast to humans. Science. 2010;328(5976):321-6.
7. Wong W, Scott JD. AKAP signalling complexes: focal points in space and time. Nature reviews Molecular cell biology. 2004;5(12):959-70.
8. Beuschlein F, Fassnacht M, Assie G, Calebiro D, Stratakis CA, Osswald A, et al. Constitutive activation of PKA catalytic subunit in adrenal Cushing's syndrome. N Engl J Med. 2014;370(11):1019-28.
9. Bult CJ, Drabkin HJ, Evsikov A, Natale D, Arighi C, Roberts N, et al. The representation of protein complexes in the Protein Ontology (PRO). BMC Bioinformatics. 2011;12:371.
10. Peters JM, Cejka Z, Harris JR, Kleinschmidt JA, Baumeister W. Structural features of the 26 S proteasome complex. J Mol Biol. 1993;234(4):932-7.
11. Voges D, Zwickl P, Baumeister W. The 26S proteasome: a molecular machine designed for controlled proteolysis. Annual review of biochemistry. 1999;68:1015-68.
12. Tanaka K. The proteasome: overview of structure and functions. Proceedings of the Japan Academy Series B, Physical and biological sciences. 2009;85(1):12-36.
13. Wehmer M, Sakata E. Recent advances in the structural biology of the 26S proteasome. Int J Biochem Cell Biol. 2016;79:437-42.
14. Gomes AV. Genetics of proteasome diseases. Scientifica. 2013;2013:637629.
15. Miller Z, Ao L, Kim KB, Lee W. Inhibitors of the immunoproteasome: current status and future directions. Current pharmaceutical design. 2013;19(22):4140-51.
16. Kaur G, Batra S. Emerging role of immunoproteasomes in pathophysiology. Immunology and cell biology. 2016;94(9):812-20.
17. Rual J-F, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005;437(7062):1173-8.
18. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006;440(7084):637-43.
19. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, et al. Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Molecular & cellular proteomics: MCP. 2007;6(3):439-50.
20. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, et al. Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006;440(7084):631-6.
21. Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, et al. A protein interaction map of Drosophila melanogaster. Science. 2003;302(5651):1727-36.
22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics: MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.
43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments: JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews: MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.
64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping C. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.
84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554.
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.


Page 60: Mesurer les associations protéiques à proximité in …...Mesurer les associations protéiques à proximité in vivo en utilisant la complémentation de fragments protéiques Mémoire

47

22. Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, et al. A map of the interactome network of the metazoan C. elegans. Science. 2004;303(5657):540-3.
23. Rajagopala SV, Sikorski P, Kumar A, Mosca R, Vlasblom J, Arnold R, et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotech. 2014;32(3):285-90.
24. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, et al. A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biology. 2007;8(7):1-19.
25. Wang Y, Cui T, Zhang C, Yang M, Huang Y, Li W, et al. Global protein-protein interaction network in the human pathogen Mycobacterium tuberculosis H37Rv. Journal of proteome research. 2010;9(12):6665-77.
26. Cherkasov A, Hsing M, Zoraghi R, Foster LJ, See RH, Stoynov N, et al. Mapping the protein interaction network in methicillin-resistant Staphylococcus aureus. Journal of proteome research. 2011;10(3):1139-50.
27. Hagen N, Bayer K, Rosch K, Schindler M. The intraviral protein interaction network of hepatitis C virus. Molecular & cellular proteomics : MCP. 2014;13(7):1676-89.
28. Fossum E, Friedel CC, Rajagopala SV, Titz B, Baiker A, Schmidt T, et al. Evolutionarily conserved herpesviral protein interaction networks. PLoS pathogens. 2009;5(9):e1000570.
29. Stellberger T, Hauser R, Baiker A, Pothineni VR, Haas J, Uetz P. Improving the yeast two-hybrid system with permutated fusions proteins: the Varicella Zoster Virus interactome. Proteome science. 2010;8:8.
30. Obado SO, Brillantes M, Uryu K, Zhang W, Ketaren NE, Chait BT, et al. Interactome Mapping Reveals the Evolutionary History of the Nuclear Pore Complex. PLoS biology. 2016;14(2):e1002365.
31. Diss G, Dube AK, Boutin J, Gagnon-Arsenault I, Landry CR. A systematic approach for the genetic dissection of protein complexes in living cells. Cell Rep. 2013;3(6):2155-67.
32. Ferreira LG, Oliva G, Andricopulo AD. Protein-protein interaction inhibitors: advances in anticancer drug design. Expert opinion on drug discovery. 2016.
33. Hamdi A, Colas P. Yeast two-hybrid methods and their applications in drug discovery. Trends in pharmacological sciences. 2012;33(2):109-18.
34. Zoraghi R, Reiner NE. Protein interaction networks as starting points to identify novel antimicrobial drug targets. Current opinion in microbiology. 2013;16(5):566-72.
35. Khare S, Nagle AS, Biggart A, Lai YH, Liang F, Davis LC, et al. Proteasome inhibition for treatment of leishmaniasis, Chagas disease and sleeping sickness. Nature. 2016.
36. Sahni N, Yi S, Taipale M, Fuxman Bass JI, Coulombe-Huntington J, Yang F, et al. Widespread macromolecular interaction perturbations in human genetic disorders. Cell. 2015;161(3):647-60.
37. Jensen LJ, Bork P. Biochemistry. Not comparable, but complementary. Science. 2008;322(5898):56-7.
38. Syafrizayanti, Betzen C, Hoheisel JD, Kastelic D. Methods for analyzing and quantifying protein-protein interaction. Expert review of proteomics. 2014;11(1):107-20.
39. Marcilla M, Albar JP. Quantitative proteomics: A strategic ally to map protein interaction networks. IUBMB life. 2013;65(1):9-16.
40. Woods AG, Sokolowska I, Ngounou Wetie AG, Wormwood K, Aslebagh R, Patel S, et al. Mass spectrometry for proteomics-based investigation. Advances in experimental medicine and biology. 2014;806:1-32.
41. Chen GI, Gingras AC. Affinity-purification mass spectrometry (AP-MS) of serine/threonine phosphatases. Methods. 2007;42(3):298-305.
42. Dunham WH, Mullin M, Gingras AC. Affinity-purification coupled to mass spectrometry: basic principles and strategies. Proteomics. 2012;12(10):1576-90.


43. Monti M, Cozzolino M, Cozzolino F, Vitiello G, Tedesco R, Flagiello A, et al. Puzzle of protein complexes in vivo: a present and future challenge for functional proteomics. Expert review of proteomics. 2009;6(2):159-69.
44. Fields S, Song O. A novel genetic system to detect protein-protein interactions. Nature. 1989;340(6230):245-6.
45. Petschnigg J, Moe OW, Stagljar I. Using yeast as a model to study membrane proteins. Current opinion in nephrology and hypertension. 2011;20(4):425-32.
46. Saraon P, Grozavu I, Lim SH, Snider J, Yao Z, Stagljar I. Detecting Membrane Protein-protein Interactions Using the Mammalian Membrane Two-hybrid (MaMTH) Assay. Current protocols in chemical biology. 2017;9(1):38-54.
47. Snider J, Kittanakom S, Curak J, Stagljar I. Split-ubiquitin based membrane yeast two-hybrid (MYTH) system: a powerful tool for identifying protein-protein interactions. Journal of visualized experiments : JoVE. 2010(36).
48. Stynen B, Tournu H, Tavernier J, Van Dijck P. Diversity in genetic in vivo methods for protein-protein interaction studies: from the yeast two-hybrid system to the mammalian split-luciferase system. Microbiology and molecular biology reviews : MMBR. 2012;76(2):331-82.
49. Bruckner A, Polge C, Lentze N, Auerbach D, Schlattner U. Yeast two-hybrid, a powerful tool for systems biology. International journal of molecular sciences. 2009;10(6):2763-88.
50. Snider J, Kotlyar M, Saraon P, Yao Z, Jurisica I, Stagljar I. Fundamentals of protein interaction network mapping. Mol Syst Biol. 2015;11(12):848.
51. Vidal M, Fields S. The yeast two-hybrid assay: still finding connections after 25 years. Nat Methods. 2014;11(12):1203-6.
52. Johnsson N, Varshavsky A. Split ubiquitin as a sensor of protein interactions in vivo. Proceedings of the National Academy of Sciences of the United States of America. 1994;91(22):10340-4.
53. Stagljar I, Fields S. Analysis of membrane protein interactions using yeast-based technologies. Trends in biochemical sciences. 2002;27(11):559-63.
54. Michnick SW. Exploring protein interactions by interaction-induced folding of proteins from complementary peptide fragments. Current opinion in structural biology. 2001;11(4):472-7.
55. Tarassov K, Messier V, Landry CR, Radinovic S, Serna Molina MM, Shames I, et al. An in vivo map of the yeast protein interactome. Science. 2008;320(5882):1465-70.
56. Freschi L, Torres-Quiroz F, Dube AK, Landry CR. qPCA: a scalable assay to measure the perturbation of protein-protein interactions in living cells. Molecular bioSystems. 2013;9(1):36-43.
57. Rochette S, Diss G, Filteau M, Leducq JB, Dube AK, Landry CR. Genome-wide protein-protein interaction screening by protein-fragment complementation assay (PCA) in living cells. J Vis Exp. 2015(97).
58. Chen X, Zaro JL, Shen WC. Fusion protein linkers: property, design and functionality. Advanced drug delivery reviews. 2013;65(10):1357-69.
59. Yu K, Liu C, Kim BG, Lee DY. Synthetic fusion protein design and applications. Biotechnology advances. 2015;33(1):155-64.
60. Petschnigg J, Snider J, Stagljar I. Interactive proteomics research technologies: recent applications and advances. Curr Opin Biotechnol. 2011;22(1):50-8.
61. Stryer L, Haugland RP. Energy transfer: a spectroscopic ruler. Proceedings of the National Academy of Sciences of the United States of America. 1967;58(2):719-26.
62. Stryer L. Fluorescence energy transfer as a spectroscopic ruler. Annual review of biochemistry. 1978;47:819-46.
63. Piehler J. New methodologies for measuring protein interactions in vivo and in vitro. Current opinion in structural biology. 2005;15(1):4-14.


64. Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol. 2003;331(2):303-13.
65. Leitner A, Faini M, Stengel F, Aebersold R. Crosslinking and Mass Spectrometry: An Integrated Technology to Understand the Structure and Function of Molecular Machines. Trends in biochemical sciences. 2016;41(1):20-32.
66. Rappsilber J. The beginning of a beautiful friendship: cross-linking/mass spectrometry and modelling of proteins and multi-protein complexes. J Struct Biol. 2011;173(3):530-40.
67. Vasilescu J, Guo X, Kast J. Identification of protein-protein interactions using in vivo cross-linking and mass spectrometry. Proteomics. 2004;4(12):3845-54.
68. Roux KJ, Kim DI, Raida M, Burke B. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells. The Journal of cell biology. 2012;196(6):801-10.
69. Remy I, Wilson IA, Michnick SW. Erythropoietin receptor activation by a ligand-induced conformation change. Science. 1999;283(5404):990-3.
70. Botstein D, Fink GR. Yeast: an experimental organism for 21st Century biology. Genetics. 2011;189(3):695-704.
71. Gagnon-Arsenault I, Marois Blanchet FC, Rochette S, Diss G, Dube AK, Landry CR. Transcriptional divergence plays a role in the rewiring of protein interaction networks after gene duplication. J Proteomics. 2013;81:112-25.
72. Vo TV, Das J, Meyer MJ, Cordero NA, Akturk N, Wei X, et al. A Proteome-wide Fission Yeast Interactome Reveals Network Evolution Principles from Yeasts to Human. Cell. 2016;164(1-2):310-23.
73. Arabidopsis Interactome Mapping Consortium. Evidence for network evolution in an Arabidopsis interactome map. Science. 2011;333(6042):601-7.
74. Filteau M, Vignaud H, Rochette S, Diss G, Chretien AE, Berger CM, et al. Multi-scale perturbations of protein interactomes reveal their mechanisms of regulation, robustness and insights into genotype-phenotype maps. Briefings in functional genomics. 2015.
75. Sahni N, Yi S, Zhong Q, Jailkhani N, Charloteaux B, Cusick ME, et al. Edgotype: a fundamental link between genotype and phenotype. Curr Opin Genet Dev. 2013;23(6):649-57.
76. Yang X, Coulombe-Huntington J, Kang S, Sheynkman GM, Hao T, Richardson A, et al. Widespread Expansion of Protein Interaction Capabilities by Alternative Splicing. Cell. 2016;164(4):805-17.
77. Bisson N, James DA, Ivosev G, Tate SA, Bonner R, Taylor L, et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol. 2011;29(7):653-8.
78. Ori A, Iskar M, Buczak K, Kastritis P, Parca L, Andres-Pons A, et al. Spatiotemporal variation of mammalian protein complex stoichiometries. Genome Biol. 2016;17:47.
79. Rochette S, Gagnon-Arsenault I, Diss G, Landry CR. Modulation of the yeast protein interactome in response to DNA damage. Journal of proteomics. 2014;100:25-36.
80. Grossmann A, Benlasfer N, Birth P, Hegele A, Wachsmuth F, Apelt L, et al. Phospho-tyrosine dependent protein-protein interaction network. Mol Syst Biol. 2015;11(3):794.
81. Landry CR, Levy ED, Abd Rabbo D, Tarassov K, Michnick SW. Extracting insight from noisy cellular networks. Cell. 2013;155(5):983-9.
82. Wan C, Borgeson B, Phanse S, Tu F, Drew K, Clark G, et al. Panorama of ancient metazoan macromolecular complexes. Nature. 2015;525(7569):339-44.
83. Kristensen AR, Gsponer J, Foster LJ. A high-throughput approach for measuring temporal changes in the interactome. Nat Methods. 2012;9(9):907-9.


84. Benschop JJ, Brabers N, van Leenen D, Bakker LV, van Deutekom HW, van Berkum NL, et al. A consensus of core protein complex compositions for Saccharomyces cerevisiae. Molecular cell. 2010;38(6):916-28.
85. Ideker T, Krogan NJ. Differential network biology. Mol Syst Biol. 2012;8:565.
86. Baker M. Proteomics: The interaction map. Nature. 2012;484(7393):271-5.
87. Michnick SW, Ear PH, Manderson EN, Remy I, Stefan E. Universal strategies in research and drug discovery based on protein-fragment complementation assays. Nat Rev Drug Discov. 2007;6(7):569-82.
88. Robinson CV, Sali A, Baumeister W. The molecular sociology of the cell. Nature. 2007;450(7172):973-82.
89. Michnick SW, Ear PH, Landry C, Malleshaiah MK, Messier V. A toolkit of protein-fragment complementation assays for studying and dissecting large-scale and dynamic protein-protein interactions in living cells. Methods Enzymol. 2010;470:335-68.
90. Ear PH, Michnick SW. A general life-death selection strategy for dissecting protein functions. Nat Methods. 2009;6(11):813-6.
91. Remy I, Michnick SW. Mapping biochemical networks with protein fragment complementation assays. Methods Mol Biol. 2015;1278:467-81.
92. Stefan E, Aquin S, Berger N, Landry CR, Nyfeler B, Bouvier M, et al. Quantification of dynamic protein complexes using Renilla luciferase fragment complementation applied to protein kinase A activities in vivo. Proc Natl Acad Sci U S A. 2007;104(43):16916-21.
93. Tchekanda E, Sivanesan D, Michnick SW. An infrared reporter to detect spatiotemporal dynamics of protein-protein interactions. Nat Methods. 2014;11(6):641-4.
94. Kerppola TK. Visualization of molecular interactions using bimolecular fluorescence complementation analysis: characteristics of protein fragment complementation. Chem Soc Rev. 2009;38(10):2876-86.
95. Gibson TJ. One-step enzymatic assembly of DNA molecules up to several hundred kilobases in size. Nature Protocol Exchange. 2009. Available from: http://www.nature.com/protocolexchange/protocols/554
96. Chatr-Aryamontri A, Oughtred R, Boucher L, Rust J, Chang C, Kolas NK, et al. The BioGRID interaction database: 2017 update. Nucleic Acids Res. 2017;45(D1):D369-D79.
97. Haarer B, Aggeli D, Viggiano S, Burke DJ, Amberg DC. Novel interactions between actin and the proteasome revealed by complex haploinsufficiency. PLoS Genet. 2011;7(9):e1002288.
98. Guerrero C, Milenkovic T, Przulj N, Kaiser P, Huang L. Characterization of the proteasome interaction network using a QTAX-based tag-team strategy and protein interaction network analysis. Proc Natl Acad Sci U S A. 2008;105(36):13333-8.
99. Archambault J, Friesen JD. Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol Rev. 1993;57(3):703-24.
100. Leitner A, Walzthoeni T, Aebersold R. Lysine-specific chemical cross-linking of protein complexes and identification of cross-linking sites using LC-MS/MS and the xQuest/xProphet software pipeline. Nat Protoc. 2014;9(1):120-37.
101. Vogel SS, van der Meer BW, Blank PS. Estimating the distance separating fluorescent protein FRET pairs. Methods. 2014;66(2):131-8.
102. Anderson P, Kedersha N, Ivanov P. Stress granules, P-bodies and cancer. Biochimica et biophysica acta. 2015;1849(7):861-70.
103. Beckham CJ, Parker R. P bodies, stress granules, and viral life cycles. Cell host & microbe. 2008;3(4):206-12.
104. Nathans R, Chu CY, Serquina AK, Lu CC, Cao H, Rana TM. Cellular microRNA and P bodies modulate host-HIV-1 interactions. Molecular cell. 2009;34(6):696-709.
