conrad et al 2007 gene duplication

Upload: dariel-marquez

Post on 04-Jun-2018


  • 8/13/2019 Conrad Et Al 2007 Gene Duplication

    1/21

    Gene Duplication: A Drive for Phenotypic Diversity and Cause of Human Disease

    Bernard Conrad1,2 and Stylianos E. Antonarakis1

    1Department of Genetic Medicine & Development, University of Geneva Medical School and Geneva University Hospitals, CH-1211 Geneva 4, Switzerland

    2Division of Human Genetics, Bern University Children's Hospital, CH-3010 Bern, Switzerland; email: [email protected]

    Annu. Rev. Genomics Hum. Genet. 2007. 8:17–35

    First published online as a Review in Advance on March 26, 2007.

    The Annual Review of Genomics and Human Genetics is online at genom.annualreviews.org

    This article's doi: 10.1146/annurev.genom.8.021307.110233

    Copyright © 2007 by Annual Reviews. All rights reserved

    1527-8204/07/0922-0017$20.00

    Key Words

    gene duplication, copy number variant, haploinsufficiency, gene balance hypothesis, insufficient amount hypothesis

    Abstract

    Gene duplication is one of the key factors driving genetic innovation, i.e., producing novel genetic variants. Although the contribution of whole-genome and segmental duplications to phenotypic diversity across species is widely appreciated, the phenotypic spectrum and potential pathogenicity of small-scale duplications in individual genomes are less well explored. This review discusses the nature of small-scale duplications and the phenotypes produced by such duplications. Phenotypic variation and disease phenotypes induced by duplications are more diverse and widespread than previously anticipated, and duplications are a major class of disease-related genomic variation. Pathogenic duplications particularly involve dosage-sensitive genes with both similar and dissimilar over- and underexpression phenotypes, and genes encoding proteins with a propensity to aggregate. Phenotypes related to human-specific copy number variation in genes regulating environmental response and immunity are increasingly recognized. Small genomic duplications containing defense-related genes also contribute to complex common phenotypes.


    INTRODUCTION

    Ever since Susumu Ohno's insightful suggestion 35 years ago that gene duplication is a key factor shaping evolution, the model and its general predictions have continued to attract much attention (70). On an evolutionary scale, gene duplication may result in new functions via different scenarios (Figure 1). Although the most likely outcome is loss of function in one of the two gene copies (Figure 1a, nonfunctionalization), in rare instances one gene copy may retain the original function while the other acquires a novel, evolutionarily advantageous (adaptive) function (Figure 1b, neofunctionalization) (30). Alternatively, after duplication, mutations may occur in both genes, which then specialize to perform complementary functions (Figure 1b, subfunctionalization) (56, 57). The question of how duplicate genes are retained in a population remains controversial.

    [Figure 1 panels: (a) Gene loss = nonfunctionalization. (b) Functional divergence: neofunctionalization, subfunctionalization; duplication by retrotransposition (male germline function, somatic function). (c) No functional divergence = genetic robustness. Duplication of gene families: (d) concerted evolution (gene conversion); (e) birth-and-death evolution (silent nucleotide substitutions versus inactivating mutations).]

    Figure 1

    Evolutionary fate of single gene duplications (a–c), and duplication of multigene families (d–e). Single gene duplication most often results in a nonfunctional duplicate gene copy (a, nonfunctionalization). (b) In rare instances, the functional duplicate gene copy and the ancestral gene diverge in function; neofunctionalization means that one of the two genes retains the original function, while the other evolves a new, often beneficial function. Subfunctionalization implies that both the original and the duplicate genes mutate and evolve to fulfill complementary functions already present in the original gene. Duplication via retrotransposition represents a particular case of sub- or neofunctionalization. Multigene families evolve in a coordinated fashion, such that the DNA coding sequences and function of the single members of a family remain close to that of the ancestral gene (d–e). (d) Concerted evolution: After multiple rounds of duplication, gene conversion homogenizes the DNA sequences of the individual members. (e) Birth-and-death evolution invokes a process of equilibrium between inactivating mutations and ongoing duplication of functional gene copies.

    Classical duplication-degeneration-complementation/subfunctionalization models do not invoke positive selection, but stipulate a higher retention rate of duplicate genes in small rather than larger populations. Considerably more retentions and fewer losses of duplicate genes in rodents as compared with humans indicate that positive selection may play a more important role than originally anticipated (91). If two redundant gene copies are retained in the genome without significant functional divergence, the organism may acquire increased genetic robustness against harmful mutations (Figure 1c). In multigene families descended from a common ancestor, individual genes in the group exert similar functions and have similar DNA sequences (67, 68). One concept, concerted evolution, applies particularly to localized and typically tandem copies of a gene. The concept posits that all genes in a given group evolve coordinately, and that homogenization is the result of gene conversion (Figure 1d). For most multigene families, the currently favored model is birth-and-death evolution, according to which similarity in protein sequence among the members of a family is assured by strong purifying selection, such that individual genes evolve essentially via silent synonymous nucleotide substitutions (67, 68) (Figure 1e). Inactivating mutations in a single member of a multigene family do not necessarily imply an evolutionary dead end, as illustrated by the "more is less" hypothesis (72). For instance, the human CASP12 pseudogene located in a cluster of functional Caspase genes shows that a protein-truncating mutation can be positively selected, probably because the variant allele confers resistance to severe sepsis (105, 108). A recently recognized primate-specific subgroup of duplications generated by retrotransposition was recruited to enhance male germline function (Figure 1b) (103). This mechanism thus represents a variation of neo- or subfunctionalization.

    Whole-genome duplication and 2R hypothesis: increased complexity and genome size of vertebrates resulted from two rounds (2R) of whole-genome duplication during early vertebrate evolution

    Small-scale duplications: a recent gene family expansion by segmental or tandem duplication 50–150 Mya that may have followed on one single precedent round (1R) of WGD

    Gene balance hypothesis: posits that an imbalance in the concentration of individual components of a multiprotein complex is deleterious

    Irrespective of these alternatives, gene duplication is, along with alternative splicing, a major mechanism of gene diversification (48, 99). For instance, along the lineage leading to humans, 689 genes were gained and 86 genes lost since the split from chimpanzees, contributing to a 6% difference in the complement of genes between humans and chimpanzees (1418 of 22,000 genes), largely outnumbering the 1.5% nucleotide difference between orthologous sequences of the two species (23).
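    The comparison between gene-complement and nucleotide differences can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the text; the variable names are illustrative and not taken from the cited study.

```python
# Figures quoted in the text (citing reference 23).
genes_differing = 1418          # genes differing between human and chimpanzee
total_genes = 22_000            # approximate size of the gene complement
nucleotide_difference = 0.015   # 1.5% difference between orthologous sequences

# Fraction of the gene complement that differs between the two species.
gene_complement_difference = genes_differing / total_genes

# How many times larger the gene-complement difference is than the nucleotide difference.
ratio = gene_complement_difference / nucleotide_difference

print(f"gene complement difference: {gene_complement_difference:.1%}")
print(f"ratio to nucleotide difference: {ratio:.1f}x")
```

    The unrounded fraction is slightly above the 6% quoted in the text, so the gene-complement difference is roughly four times the nucleotide difference.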

    Whole-Genome versus Small-Scale Duplications

    Recent analysis of eukaryotic genomes shows that gene duplication is widespread (50, 56). A series of arguments suggest that current vertebrate genomes have been shaped by two rounds of whole-genome duplication (22, 100). An alternative hypothesis favors one single round of whole-genome duplication plus continuous small-scale duplications (35). The fate of small-scale versus whole-genome duplication was addressed using Arabidopsis, a model system that underwent several rounds of complete genome duplication. This analysis revealed not only the dominant evolutionary impact of whole-genome duplication, but also the important role of small-scale duplications for gene categories related to metabolism, stress responses, and cell death (58). The functional gene categories that have been selectively maintained by the two mechanisms differ somewhat (20). For instance, a group of genes also referred to as connected genes, encoding among others transcriptional regulators with limiting downstream partners that participate in protein-protein interactions, and interacting proteins that are part of signal transduction pathways (31), are consistently underrepresented among small-scale duplications, and overrepresented after whole-genome duplication (76). Duplication of such dosage-sensitive genes, required at stoichiometrically precise levels, may be tolerated in whole-genome, but not in small-scale, duplications (20). These findings comply with the gene balance hypothesis (9, 10, 31, 76, 102), which posits that connected genes are particularly dosage sensitive, and are expected to be overretained relative to nondosage-sensitive genes after whole-genome and larger-scale duplications, but not after local duplications.

    Recent segmental duplications: sequences that are >1 kb in length and that show >90% sequence identity

    Breakpoint in conserved synteny: syntenic segments or blocks ≥100 kb with breakpoints identified as change in orientation or chromosome location based on unique regions of the genome

    Copy number variant (CNV) or copy number polymorphism (CNP): a DNA segment ≥1 kb present at variable copy number compared with a reference genome

    Copy number variable region (CNVR): CNVs identified in individuals of the HapMap collection, called in more than one individual, and replicated by a second independent platform

    Recent Segmental Duplications

    Recent segmental duplications comprise about 5% of the euchromatic portion of the human genome (26). Although one current definition includes sequences that are >1 kb in length and that show >90% sequence identity (6, 16), other authors use the term variably, giving rise to different numbers of duplicated segments in the genome. Segmental duplications are particularly enriched at pericentromeric and subtelomeric regions; subtelomeric duplications account for 40% (51), and pericentromeric duplications for 33% of the total (90). A considerable fraction (25%) of the nonpericentromeric and nonsubtelomeric duplications are associated with syntenic breaks (4, 5), and 98% of primate-specific breakpoints contain segmental duplications (66). This was recently confirmed for the duplication-rich human chromosomes 15 (113) and 17 (114) (HSA15, HSA17). On these chromosomes, the human-specific breakpoints of conserved synteny (breakpoints identified as a change in orientation or chromosome location based on unique regions of the genome, and for syntenic segments ≥100 kb) (5) occur mostly in regions containing duplications; 13 out of 15 breakpoints on HSA15 contain duplications, and 74% of duplicated bases on HSA17 reside in breakpoints. At least 50% of duplications are strictly intrachromosomal [50% for HSA15 (110) and 62% for HSA17 (111)]. Between 3.5% and 11% of all duplications contain complete genes (110), and nearly one third of duplicated genes are arrayed in tandem (92). An unexpected feature of areas with tandem duplications is their asynchronous replication, suggesting that duplicate structures alter the epigenetic state of a given locus (33). Ultraconserved elements (UCEs) are rare in segmental duplications and copy number variants (CNVs), except for the UCE subclass that overlaps exons (24). The mechanistic basis for this exquisite intolerance of UCEs to copy number changes awaits further characterization.

    At least 30% of the pericentromeric duplications were duplicatively transposed from euchromatic regions, generating a minimum of 28 new transcripts that are expressed primarily in the testis (90). The overall proportion of duplications is highest close to centromeres, and gradually diminishes within a distance of 5 Mb from centromeres. Conversely, gene density increases with distance from centromeric α-satellite repeats. The subtelomeres contain 25 small gene families organized in tandemly repeated blocks sharing extensive sequence similarity, and 18 of these have at least one functional member (51). Gene products within these blocks exert highly varied functions, but are predominantly odorant and cytokine receptors, tubulins, and transcription factors.

    Several recent studies revealed the widespread presence of large-scale CNVs in the normal human population (85, 88). Deletions and insertions were equally represented, often occurring around regions of chromosomal instability (88). In one study, 70 different genes were shown to vary in copy number within CNVs (85). Similarly, extensive CNVs in subtelomeric blocks were documented (51). Approximately 1450 copy number variable regions (CNVRs) encompassing 360 Mb, or 12% of the genome, were mapped through the study of 270 individuals from the HapMap collection (80), and 5150 CNVs genome-wide are currently recorded in the available databases [http://paralogy.gs.washington.edu/structuralvariation/ (85), http://projects.tcag.ca/variation/ (43), http://www.som.soton.ac.uk/research/geneticsdiv/anomaly%20register/]. Within human CNVs, a significant overrepresentation of genes associated with environmentally regulated functions and immunity was found, suggesting an adaptive advantage of dosage imbalance in these regions (69). Mechanistically, duplications that share extensive sequence similarity, called low copy repeats (LCRs), occur frequently in regions with large duplicated stretches. One common definition of LCRs is sequences that are ≥10 kb in length, show ≥95% sequence identity, and are separated by 50 kb–10 Mb of intervening sequence (97). Such LCRs serve as substrates for nonallelic homologous recombination (NAHR) that can result in duplications, reciprocal deletions, inversions, and reciprocal translocations (83, 89). The primate-specific burst of Alu repeats 35–40 million years ago (Mya) could have been one critical event initiating segmental gene duplications (7). Consistent with this, the generation and structure of LCRs appear to be associated with Alu elements (89), and 492 human-specific deletions can be attributed to this process (87). A detailed analysis of the proximal short arm of HSA17 indicated that NAHR between LCRs is a major mechanism of recurrent rearrangements, whereas nonhomologous end-joining (NHEJ) can, in many instances, be responsible for nonrecurrent rearrangements (54). The prevailing molecular mechanism also depends on the chromosomal position, because subtelomeric duplications were generated almost exclusively via NHEJ of double-strand breaks (51).

    Functional, duplicated genes comprise an important, rapidly evolving euchromatic fraction of our genome that displays extensive polymorphism; it is therefore important to examine their contribution to phenotypic variation and disease.
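    The operational definitions quoted in this section (segmental duplication: >1 kb and >90% identity; low copy repeat: 10 kb, 95% identity, 50 kb to 10 Mb apart) can be expressed as simple predicates. The following is a minimal sketch, not a method from the review: the AlignedPair record and both function names are hypothetical, and the LCR thresholds are assumed to be inclusive bounds.

```python
from dataclasses import dataclass


@dataclass
class AlignedPair:
    """Two paralogous sequence copies on one chromosome (hypothetical record type)."""
    length_bp: int        # aligned length shared by the two copies
    identity: float       # fraction of identical bases, 0.0 to 1.0
    separation_bp: int    # intervening sequence between the two copies


def is_segmental_duplication(pair: AlignedPair) -> bool:
    # Definition quoted in the text: >1 kb in length and >90% sequence identity.
    return pair.length_bp > 1_000 and pair.identity > 0.90


def is_low_copy_repeat(pair: AlignedPair) -> bool:
    # Definition quoted in the text: 10 kb, 95% identity, 50 kb to 10 Mb apart.
    # Treating these as inclusive bounds is an assumption of this sketch.
    return (
        pair.length_bp >= 10_000
        and pair.identity >= 0.95
        and 50_000 <= pair.separation_bp <= 10_000_000
    )


if __name__ == "__main__":
    long_similar = AlignedPair(length_bp=24_000, identity=0.98, separation_bp=1_400_000)
    short_diverged = AlignedPair(length_bp=5_000, identity=0.93, separation_bp=200_000)
    for pair in (long_similar, short_diverged):
        print(pair, is_segmental_duplication(pair), is_low_copy_repeat(pair))
```

    Under these definitions every LCR is also a segmental duplication, but not the reverse, which matches the text's description of LCRs as the highly similar subset that serves as a substrate for NAHR.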

    Evolutionary Forces and Duplicated Genes

    The evolutionary forces acting on duplicated genes are diverse. A number of interdependent variables determine whether a gene will be retained after duplication; these include its functional category (46, 61, 76), degree of conservation (18, 19, 45), sensitivity to dosage effects (46), as well as its regulatory and architectural complexity (39). In general, genes encoding proteins that interact with the environment tend to be more frequently retained after duplication than those that interact with intracellular compartments (50, 61). Genes with permanent duplicates in the Caenorhabditis elegans and Saccharomyces cerevisiae genomes are more constrained before duplication than genes that never duplicated (19). However, human CNVs are consistently enriched in genes with increased rates of synonymous and nonsynonymous codon substitutions (69). Haploinsufficient genes (i.e., genes experiencing a loss of fitness when present in a single copy in diploid species) have more paralogs than haplosufficient genes, supporting the concept that gene dosage may be critical in fixing duplications (gene balance hypothesis) (46). In general, duplicated genes encode longer proteins containing more domains, and have more cis-regulatory elements, than singleton genes (39). These observations indicate that natural selection created a preferential association of duplications with certain gene categories.

    Low copy repeats (LCRs): sequences that are ≥10 kb in length that show ≥95% sequence identity, and are separated by 50 kb–10 Mb of intervening sequence

    Haploinsufficient genes: subset of genes experiencing a loss of fitness when present in a single copy in diploid species

    Experiments performed on multiple species, from unicellular organisms to mammals, indicate that genes rapidly diverge after duplication (Figure 2) (17, 47). Although divergence of duplicated genes was traditionally assessed by comparing the rate of amino acid changes, more recently the focus has been on divergence of gene expression, of transcriptional networks, and of protein-protein interactions. It has been shown that the protein-coding sequence and cis-regulatory regions of duplicated genes evolve independently (Figure 2a) (104). Shortly after duplication, expression and regulatory divergence exceed changes in the protein sequence (36). In humans and yeast, protein sequence evolution and gene expression diversity significantly correlate and increase with evolutionary time (37, 60). In addition, the mode of duplication and the functions of the genes involved also play important roles in expression divergence (13). Divergence of cis-regulatory motifs in the promoter-proximal region (Figure 2c) is probably not the only substrate for expression divergence, suggesting that trans-acting factors could also be important (111). In humans, genes diverge asymmetrically after duplication; they rapidly lose their original coexpressed partners and acquire new ones (Figure 2c) (17). Expression levels of duplicate genes diverge significantly during development within and between species, as compared to single-copy genes (37). Paralogous genes in humans and mice tend to become more specialized in their expression patterns (42).

    [Figure 2 panels: (a) cis-regulatory region and protein-coding region; cis-regulatory divergence plotted against t (evolutionary time). (b1) Protein-sequence divergence. (b2) Protein-network divergence. (c) Regulatory divergence.]

    Figure 2

    Functional divergence of duplicated genes. (a) The cis-regulatory and the protein-coding regions evolve independently after duplication. Divergence increases with evolutionary distance. (b) Schematic representation of (b1) protein sequence divergence after duplication (green sequence diverges into blue or orange); (b2) protein network divergence, where the protein interaction domains of the original green sequence evolve by maintenance, gain, or loss of interacting partners. (c) Schematic representation of DNA sequence regulatory divergence after duplication (certain regulatory motifs are lost in one copy of the duplicated gene sequence).

    The protein interaction partners (Figure 2b) change at a slower rate than the transcription factors shared by duplicated genes (62). The divergence of protein-protein interactions after duplication depends on the connectivity [i.e., number of binding partners of a protein (8)] of the ancestral gene (112). Proteins with a higher ancestral connectivity tend to display an asymmetrical evolution of the duplicates, whereas duplicates with a lower connectivity tend to gain and lose interacting partners at about the same rate. In conclusion, gene expression divergence is a key substrate for functional divergence of duplicate genes, and gene regulatory networks evolve at a higher rate after duplication than protein-protein interactions.

    The Phenotypic Spectrum of Duplicated Genes

    Gene dosage effects. The phenotypic consequence of duplicating large regions or even whole chromosomes is at least in part determined by the extent of regulatory imbalances. It is a priori expected that duplicate genes (three copies per diploid genome) will exhibit mRNA expression at 1.5-fold the diploid level. Consistent with the prediction, 50% of trisomic genes are overexpressed at the expected 1.5-fold level or higher; this is true for genes overexpressed as a consequence of an additional whole chromosome copy in human trisomy 21 and in the respective trisomic mouse models (81). Similarly, a recent comparison of chimpanzee and human segmental duplications revealed that among the human-specific duplicates (causing trisomies), 56% show a significant difference in gene expression between the two species, mostly (83%) overexpression in human as compared to chimpanzee (15).

    For genes that fail to show such a dose-dependent increase, compensatory mechanisms are usually involved (i.e., dosage compensation). For instance, this can happen when the regulator controlling transcription and the target gene reside together on an aneuploid segment, canceling dosage imbalances (9). Conversely, duplicated regions containing positive or negative regulators can produce negative effects on target genes located outside of the aneuploid area (9, 10). The main conclusion from this is that the most significant alteration in gene expression caused by aneuploidy will affect target genes that lie outside the aneuploid region.

    UNDER- AND OVEREXPRESSION PHENOTYPES: THE GENE BALANCE HYPOTHESIS VERSUS THE INSUFFICIENT AMOUNT HYPOTHESIS

    Another extrapolation of the gene balance hypothesis is that under- and overexpression phenotypes are identical, or at least similar, as they both act via the disruption of an identical multimeric, regulatory protein complex (disrupted stoichiometry of protein subunits encoded by connected genes).

    The primary mechanism of haploinsufficiency in yeast is an insufficient protein production, which led to the formulation of the insufficient amount hypothesis, contradicting the predictions of the balance hypothesis. In further support of the insufficient amount hypothesis is the fact that under- and overexpression phenotypes of bona fide connected genes in yeast differ, indicating specific regulatory imbalances, rather than disrupted stoichiometry (see References 25 and 96).

    In aneuploidy, it is generally assumed that only a restricted set of dosage-sensitive genes is responsible for the phenotype. In a systematic analysis of overexpression phenotypes in yeast, 15% of overexpressed genes reduced cell growth, and among those, cell cycle-regulated genes, signaling molecules, and transcription factors were enriched (96). Interestingly, the overexpression phenotypes differed from the deletion mutant phenotypes, indicating that the underlying mechanisms are specific for under- and overexpression, respectively, rather than resulting from a common disruption of protein complex stoichiometry. These results contradict the gene balance hypothesis and are consistent with the insufficient amount hypothesis (25), in which haploinsufficient genes are needed at abnormally high levels, and therefore are more sensitive to a reduction in dose (see sidebar on Under- and Overexpression Phenotypes: The Gene Balance Hypothesis versus the Insufficient Amount Hypothesis for more information).
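    The a priori expectation stated in this subsection, that three gene copies in a diploid genome yield 1.5-fold expression, is a simple copy-number ratio. The sketch below makes that ratio explicit and labels an observed fold change as dose-dependent or compensated. The function names and the 10% tolerance are illustrative assumptions, not values from the studies cited.

```python
def expected_fold_change(copies: int, baseline_copies: int = 2) -> float:
    """Expected expression relative to the diploid state if mRNA scales with copy number."""
    if copies < 0 or baseline_copies <= 0:
        raise ValueError("copy numbers must be non-negative, baseline positive")
    return copies / baseline_copies


def classify_expression(observed_fold: float, copies: int, tolerance: float = 0.1) -> str:
    """Label a duplicated gene by comparing observed and expected fold change.

    'dose-dependent' means expression reaches the expected level (within the
    tolerance) or exceeds it; anything lower is labeled 'compensated'.
    The tolerance is an arbitrary choice for illustration.
    """
    expected = expected_fold_change(copies)
    if observed_fold >= expected * (1.0 - tolerance):
        return "dose-dependent"
    return "compensated"


if __name__ == "__main__":
    print(expected_fold_change(3))        # trisomy or heterozygous duplication
    print(expected_fold_change(1))        # monosomy or heterozygous deletion
    print(classify_expression(1.6, copies=3))
    print(classify_expression(1.05, copies=3))
```

    A real analysis would compare replicate measurements with a statistical test rather than a fixed tolerance; the fixed cutoff here only illustrates the distinction the text draws between dose-dependent overexpression and dosage compensation.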

    Relationship of gene dosage and fitness (phenotype). The relationship between gene dosage and fitness (phenotype) is complex. Three alternatives have been proposed, each one characteristic of certain gene categories (Figure 3) (46). The first describes a linear relationship that is often found for structural and regulatory proteins (Figure 3a); the second fits a diminishing returns principle typical of enzymes that function at limiting concentrations (Figure 3b). Consistent with this, disorders caused by genes encoding enzymes are primarily recessive, indicating that these genes are not dosage sensitive (haplosufficient) (44). The third alternative corresponds to a diminished fitness for both increased and decreased gene dosage, indicating either multisubunit complexes with a single component that has a tight stoichiometry (46) (gene balance hypothesis), or specific regulatory imbalances as a consequence of under- (insufficient amount hypothesis) and overexpression (25) (Figure 3c). Transcriptional regulators can cause phenotypes both when under- and overexpressed (84), a mechanism responsible for many developmental/malformation syndromes (Figure 3c) (Table 1). A fourth alternative concerns proteins with a propensity to aggregate (Figure 3d). These proteins display a dual behavior. The wild-type protein aggregates in a dose-dependent manner once the diploid threshold dose is exceeded (e.g., gene duplication). Alternatively, aggregation occurs in the diploid state in the presence of a dominant mutation (Figure 3d) (49). Genes encoding such proteins include α-synuclein (SNCA), responsible for one rare familial form of early-onset Parkinson disease, and the amyloid precursor protein (APP) that causes one form of dominant early-onset Alzheimer disease (38, 93).

    [Figure 3 panels, each plotting fitness (phenotype) against protein concentration (genotype) at 0 (aa), 0.5 (aA), 1 (AA), and 2 (AA,AA): (a) Linear function; examples: structural and regulatory proteins. (b) Diminishing returns function; examples: enzymes (see Table 1). (c) Stoichiometric titration; examples: dosage-dependent transcription factors (see Table 1). (d) Aggregation; examples: proteins with propensity to aggregate (Parkinson/Alzheimer disease) (see Table 1).]

    Figure 3

    Schematic representations of gene dosage and fitness (phenotype). Four different relationships are shown. (a) A linear relationship is typically found for structural and regulatory proteins; (b) a diminishing returns function classically involves enzymes that show little variation in function over large dose ranges. (c) Certain functional gene classes enriched for transcriptional regulators and signaling molecules cause phenotypes both when under- and overexpressed (haploinsufficiency and pathogenic gene duplication). (d) Disease phenotypes caused by protein aggregation. The wild-type protein aggregates at doses beyond the threshold level of two gene copies. Heterozygous mutations can also cause protein aggregation at one gene copy. Uppercase A depicts the normal allele; lowercase a represents a mutant loss of function allele.

    Haploinsufficiency: a dominant phenotype in a diploid organism that is heterozygous for a loss of function allele
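    The four dose-fitness relationships described for Figure 3 can be made concrete with toy curves. The functional forms below are assumptions chosen only to reproduce the qualitative shapes (linear, saturating, peaked at the diploid dose, and thresholded); they are not taken from the review or its references. Dose is expressed relative to the diploid state, so aa = 0, aA = 0.5, AA = 1, and a duplication (AA,AA) = 2. The aggregation curve models only the toxicity of excess wild-type protein and ignores both loss of function and the dominant-mutation case.

```python
def linear(dose: float) -> float:
    """(a) Fitness proportional to dose."""
    return dose


def diminishing_returns(dose: float, k: float = 0.1) -> float:
    """(b) Saturating curve, scaled so the diploid dose (1.0) gives fitness 1.0."""
    return dose * (1.0 + k) / (dose + k)


def stoichiometric(dose: float) -> float:
    """(c) Fitness peaks at the diploid dose and falls off on both sides."""
    return max(0.0, 1.0 - abs(dose - 1.0))


def aggregation(dose: float, threshold: float = 1.0) -> float:
    """(d) No aggregation penalty up to the diploid threshold, then a decline with excess dose."""
    if dose <= threshold:
        return 1.0
    return max(0.0, 1.0 - (dose - threshold))


# Genotype labels and relative doses as used on the axes of Figure 3.
DOSES = {"aa": 0.0, "aA": 0.5, "AA": 1.0, "AA,AA": 2.0}

if __name__ == "__main__":
    models = (linear, diminishing_returns, stoichiometric, aggregation)
    for genotype, dose in DOSES.items():
        values = " ".join(f"{model(dose):.2f}" for model in models)
        print(f"{genotype:>6} dose={dose:.1f} {values}")
```

    With these toy forms, halving the dose barely changes fitness under the diminishing returns curve, which mirrors the observation that enzyme disorders are mostly recessive, whereas the peaked curve penalizes both the heterozygous loss and the duplication.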

    Synopsis of diseases and mechanisms asso-

    ciated with duplicated genes. In Table 1,

    we summarized the main features of pheno-types and diseases caused by duplicated genes.

    Examples of pathogenic duplications in-

    volving dosage-sensitive genes. The gen-

    eral models discussed predict that among thegenes overexpressed as a result of duplication,

    a minor fraction of dosage-sensitive genesenriched for transcription factors, signaling

    molecules, and cell cycle-regulated genes willbe critical for the phenotypic features (96).

    Competing models stipulate that over- andunderexpression phenotypes of such genes areeither similar (gene balance hypothesis), or al-

    ternatively distinct (insufficient amounts hy-pothesis). In Charcot-Marie-Tooth (CMT)

    disease, the most frequent inherited disor-der of the peripheral nervous system (55,

    65), a 1.4-Mb tandem duplication is foundin 70% of autosomal dominant CMT1A

    patients. Although the 1.4-MB interval con-tains 30 genes,PMP22,encoding the major

    myelin protein, is responsible for the disorderthrough a gene dosage effect. In the heterozy-

    gous duplication state,PMP22trisomy causesa nerve conduction deficit due to demyliniza-

    tion (CMT1A), whereasPMP22 monosomy

    (deletion) causes a nerve conduction block[hereditary neuropathy with liability to pres-

    sure palsies (HNPP)] in its haploinsufficientstate. Because many central and peripheral

    demyelinating disorders, such as CMT1A,Pelizaeus-Merzbacher disease (PMD) (27),

    and autosomal dominant leukodystrophy(ADLD) (75), can be caused by gene duplica-

    tion, it was suggested that myelin formationis particularly susceptible to gene dosage ef-

    fects (75). The CMT/HNPP-causing regionshows how two distinct phenotypes can result

    from over- and underexpression of a critical

    dosage-sensitive gene. Similarly, deletion and

    duplication of the Sotos syndrome region and

    NSD1gene dosage effects cause contrastingphenotypes (14).

    Over- and underexpression phenotypes can also be similar, as shown by the methyl-CpG-binding protein 2 (MECP2), a chromatin architectural protein linked to transcriptional repression (52). The progressive neurodegenerative disorder Rett syndrome, which in its classical form affects females almost exclusively, is due to MECP2 haploinsufficiency. Severe mental retardation and neurological symptoms with features of Rett syndrome in males can also be caused by MECP2 duplications (101). Likewise, abnormal neurodevelopmental phenotypes linked to both MECP2 under- and overexpression were described in humans and in transgenic mouse models (53). Similarly, X-linked hypopituitarism can be caused by inactivating heterozygous SOX3 mutations, or by duplication of a region containing the developmental transcription factor SOX3 (Table 1) (107).

    Pathogenic duplications also occur in regions of genomic microdeletions, such as the velocardiofacial, the Williams-Beuren, the Alagille, and the Smith-Magenis syndrome regions (Table 1) (64, 79, 95, 109). These examples partially fit the model that duplication of one or a few dosage-sensitive genes causes overexpression phenotypes that resemble the underexpression phenotype (Figure 3c). On the other hand, the duplication at 1p36 matches an incremental/linear gene-phenotype model for closure of cranial sutures (Figure 3a-b) (32). Haploinsufficiency of genes in this region results in delayed closure of cranial sutures, whereas increased gene dosage, for instance via duplication, results in craniosynostosis. A heterogeneous group of duplications, spanning 0.5- to 1-Mb regions, causes phenotypes (a) for which the corresponding monosomy has not yet been reported (split hand/split foot syndrome, SHFM3) (21); (b) for which duplication and haploinsufficiency of as-yet-unidentified genes have been implicated in a similar/identical phenotype (mental retardation/orofacial clefting syndrome) (74); and (c) for which haploinsufficiency of a transcription factor (ATRX at Xq13.3) and duplication of a second region containing candidate ATRX target genes (region at 16p13.11-16p13.3 comprising the α-globin gene) result in a similar clinical outcome (X-linked α-thalassemia/mental retardation syndrome) (2). Although these examples are more complex than those previously discussed, the last two cases may be compatible with similar under- and overexpression phenotypes.

    Table 1 Duplication phenotypes

    Species | Gene | Gene category | Genomic alteration | Disease | Phenotype | Mechanism | References

    Similar under- and overexpression phenotypes

    Hs/Mm | MECP2 | Methylated DNA binding | Duplication MECP2 | Progressive neurodevelopmental syndrome in males | Mental retardation, epilepsy | Loss of gene function due to under- and overexpression | (52, 98)

    Hs | SOX3 | Developmental regulator (TF) | Duplication SOX3 | X-linked hypopituitarism (XLHP) | X-linked hypopituitarism and infundibular hypoplasia | Loss of gene function due to under- and overexpression | (104)

    Hs | ND (TBX1) | ND | Duplication 22q11.2 | Velocardiofacial syndrome (VCFS) | Variable: normal to developmental delay and malformations | Loss of gene function due to under- and overexpression | (106)

    Hs | ND (ELN) | ND | Duplication 7q11.23 | Williams-Beuren syndrome (WBS) | Delayed expressive language | Loss of gene function due to under- and overexpression | (92)

    Hs | ND (JAGGED1) | Developmental regulator (TF) | Duplication 20p11 | Alagille syndrome (AS) | Cardiovascular, ocular, bile duct, and skeletal anomalies | Loss of gene function due to under- and overexpression | (63)

    Hs | RAI1 | Developmental regulator (TF) | Duplication 17p11.2 | Smith-Magenis syndrome (SMS) | Mild mental retardation and dental abnormalities | Loss of gene function due to under- and overexpression | (77)

    Hs | PLP1 | Proteolipid protein | Duplication PLP1 | Pelizaeus-Merzbacher (PM) | Demyelination disorder CNS | Loss of gene function due to under- and overexpression | (26)

    Dissimilar under- and overexpression phenotypes

    Hs | PMP22 | Myelin protein | Duplication PMP22 | Charcot-Marie-Tooth 1A (CMT1A) | Peripheral myelin neuropathy | Loss of gene function due to under- and overexpression | (54, 64)

    Hs | NSD1 | Histone methyltransferase | Duplication NSD1 | Growth retardation syndrome | Growth retardation | Loss of gene function due to under- and overexpression | (63)

    Hs | ND (MMP23) | ND | Duplication 1p36 | Premature closure cranial sutures | Craniosynostosis | Incremental gene function | (31)

    Complex, yet unresolved expression phenotypes

    Hs | LMNB1 | Laminar nuclear envelope protein | Duplication LMNB1 | Autosomal dominant leukodystrophy (ADLD) | Demyelination disorder CNS | ND | (73)

    Hs | ND | ND | Duplication 10q24 | Split hand/split foot malformation 3 (SHFM3) | Split hand/split foot | ND | (21)

    Hs | ND | ND | Duplication 2q13 | Orofacial clefting/cleft palate only | Mental retardation and orofacial clefting | ND | (72)

    Hs | ND | ND | Duplication 16p13 | ATR-X-like | X-linked α-thalassemia/mental retardation | ND | (2)

    Protein aggregation due to overexpression

    Hs | SNCA | Molecular chaperone | Duplication SNCA | Parkinson disease | Nigrostriatal neuron degeneration | Protein aggregation and underlying gene mutations | (91)

    Hs | APP | Amyloid precursor protein | Duplication APP | Alzheimer disease | Parenchymal/vascular amyloid deposition | Protein aggregation and underlying gene mutations | (37, 79)

    Phenotypes of CNVs related to environment and immunity

    Hs | CYP2D6 | P-450 isoenzyme | P-450 CNV | Altered drug metabolism | Adverse drug effects | Incremental/linear gene function model | (12, 40, 75)

    Hs | CCL3L1 | Chemokine | CCL3L1 CNV | Altered HIV susceptibility | Enhanced HIV/AIDS susceptibility | Incremental/linear gene function model | (33)

    Common complex phenotypes of CNVs in defense-related genes

    Hs/Rn | FCGR3B/Fcgr3-rs | Fc receptor for IgG | FCGR3 CNV | Glomerulonephritis/systemic lupus erythematosus | Susceptibility to glomerulonephritis | Pathogenic underexpression phenotype | (1)

    Mm | TLR7 | Toll-like receptor | TLR7 CNV | Systemic lupus erythematosus-like disease | Autoantibody-elicited autoimmunity | Pathogenic overexpression phenotype | (76)

    Hs | hBD2 | Antimicrobial peptides | hBD2 CNV | Crohn's disease of the colon | Inflammatory bowel disease | Pathogenic underexpression phenotype | (28)

    Diseases caused by protein aggregation.

    Protein aggregation is particularly associated with disease genes (106). Rare forms of two neurodegenerative disorders, Parkinson and Alzheimer disease, provide instructive examples of pathogenic gene dosage effects mediated via protein aggregation (Figure 3d) (38, 93). Alpha-synuclein (SNCA) duplication and triplication lead to increased expression of α-synuclein, a small protein thought to play a role as a molecular chaperone in vesicular transport and/or turnover of synaptic vesicles. The pathology induced by this protein and the severity of disease both depend on the gene expression level. Although the mechanism is not yet fully understood, α-synuclein aggregation, controlled by mutations in at least three genes including SNCA itself, is thought to predispose nigrostriatal neurons to degeneration (94).

    The amyloid precursor protein (APP) can be cleaved into either of two smaller peptides, Aβ40 and Aβ42 (38). Accumulation of the latter, causing parenchymal and vascular deposition, is probably responsible for Alzheimer pathogenesis. The disease phenotype can be caused by a variety of mechanisms that lead to an increase in APP protein aggregates, namely duplications of APP on chromosome 21 (82), Down syndrome (trisomy 21) with three APP copies, or, alternatively, mutations in APP itself or in the genes encoding the proteases responsible for APP cleavage (38). Similar mechanisms may account for other neurological diseases that are based on protein deposition (38).

    Modified responses to drugs. Genes encoding enzymes are not usually dosage sensitive, yet there are examples of such gene duplications that cause variable phenotypes, particularly those encoding protein complexes that regulate drug metabolism (12). The three cytochrome P-450 isoforms CYP2C9, CYP2C19, and CYP2D6, which all vary in copy number, are responsible for the biotransformation of about 40% of all drugs that are metabolized by P-450. Copy number variation among the 80 distinct alleles of the CYP2D6 gene partially defines the individual capacity to metabolize drugs. Four major phenotypic categories of drug oxidation have been recognized to date, namely poor, intermediate, extensive, and ultrarapid metabolizers. Because these phenotypic differences have clinical consequences, such as adverse drug reactions or therapeutic failure (77), it will be important to develop accurately predictive genotyping for these and other enzymes that appear to display clinically relevant CNVs (41). CNVs are currently under investigation for their pharmacogenetic relevance (73).
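How copy number might feed into the four metabolizer categories can be illustrated with a deliberately simplified Python sketch. The cutoffs below are hypothetical and are not given in the review: the sketch assumes that only the number of functional CYP2D6 copies matters, whereas the text notes that copy number only partially defines metabolic capacity, so real predictive genotyping would also have to weigh the activity of each allele. The function name `metabolizer_category` is an illustrative choice.

```python
def metabolizer_category(functional_copies: int) -> str:
    """Map a count of functional CYP2D6 copies to a metabolizer category.

    The cutoffs are illustrative assumptions, not values from the review.
    """
    if functional_copies < 0:
        raise ValueError("copy number cannot be negative")
    if functional_copies == 0:
        return "poor"
    if functional_copies == 1:
        return "intermediate"
    if functional_copies == 2:
        return "extensive"
    return "ultrarapid"  # more than two functional copies, e.g. after duplication


for n in range(4):
    print(n, metabolizer_category(n))
```

The point of the sketch is only that a gene duplication can shift an individual from one category to the next, which is how a copy number variant translates into an altered drug response.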

    Altered responses to pathogens. Because genes encoding proteins in metabolic pathways in unicellular organisms have a higher duplicability rate than other gene categories (61), this could generally be true for genes responding to and/or regulated by environmental stimuli. Intriguingly, copy number variation of the gene encoding CCL3L1, a potent human immunodeficiency virus-1 (HIV-1)-suppressive chemokine, influences the susceptibility to HIV-1 infection (34). One could speculate that CNVs containing genes related to immune responses may be involved in the control of infections. Of relevance in this regard are CNVs of the novel defensin gene family, whose members function as chemotactic activators of the immune response (3, 86). As discussed below, defensin CNVs may cause inflammatory bowel disease by modulating the host response to pathogens (29). Gene families that constitute the core elements controlling infections, namely the T-cell receptor and immunoglobulin genes, evolved via species-specific sequential rounds of duplications (98). Copy number variation in these gene families may also play a role in complex phenotypes such as autoimmune diseases.

    Role of copy number variants in complex diseases. An example of the possible role of CNVs in common complex phenotypes is the demonstration that low copy numbers of the FCGR3B gene encoding the activatory Fc receptor for IgG predispose patients with systemic lupus erythematosus to an inflammatory disease of the kidneys (glomerulonephritis) (1). Analogous findings in rats corroborate this observation. Additional examples include duplication of TLR7, an innate immune receptor gene, which predisposes mice carrying it to autoreactive B-cell responses (78), and low copy number of the human beta-defensin 2 (hBD2) gene, which predisposes to Crohn's disease of the colon (29). The importance of CNVs in genes related to immune defense for complex phenotypes and diseases should therefore be explored more generally.

    Disease-causing gene conversion mediated by duplicated pseudogenes. Among the 1945 nonprocessed pseudogenes located near the ancestral gene (out of a total of 3426 nonprocessed pseudogenes), 11 cases of deleterious gene conversion induced by the pseudogene were recognized (11). Among those is the retinitis pigmentosa 9 pseudogene (RP9P) carrying a mutation that produces a nonsynonymous substitution in the wild-type gene associated with the RP9 form of autosomal dominant retinitis pigmentosa (ADRP). Two other processed pseudogenes also contain mutations associated with diseases: an inosine monophosphate dehydrogenase 1 pseudogene (IMPDH1P1) that causes the RP10 form of ADRP, and a phosphoglycerate kinase 1 pseudogene (PGK1P1) associated with phosphoglycerate kinase deficiency. These observations highlight the pathogenic potential of duplicate gene copies that acquired inactivating mutations (pseudogenes) for the wild-type progenitor gene located nearby.

    Partial phenotype rescue by a duplicated gene. The second most frequent autosomal recessive disease in Europeans, spinal muscular atrophy (SMA), adds an important lesson to the phenotypic consequences of gene duplication. The disease, which is usually due to homozygous mutations in the survival motor neuron 1 gene (SMN1), can be partially rescued by increasing copy numbers of its nearly identical and partially functional duplicate SMN2 (28). This example emphasizes the partial functional redundancy of some duplicate genes (40), and supports the notion that deletion of a duplicate gene results in a less severe phenotype than deletion of a singleton gene.

    CONCLUSIONS AND PERSPECTIVES

    The phenotypes induced by gene duplication are frequent and diverse, and have not yet been fully appreciated. Several fields deserve more investigation, notably the role of gene duplication in (a) monogenic phenotypes [particularly disorders implicating haploinsufficient dosage-sensitive genes, for instance congenital heart disease (71)]; (b) polygenic complex multifactorial phenotypes, as exemplified by the implication of the FCGR3 and hBD2 copy number in glomerulonephritis and Crohn's disease, and by the role of TLR7 duplications in autoimmunity elicited by autoantibodies; (c) drug, host, pathogen, and metabolic responses; (d) diseases due to protein aggregation; and (e) gene regulation, as exemplified by the contribution of microRNA gene duplication to complex patterns of gene regulation (59), and by duplication of conserved noncoding regions that helped diversify expression of developmental regulators (63). Finally, because increased dosage of genes related to olfaction, immunity, and protein secretion may have been positively selected in humans (69), a possibly more general contribution of CNVs in genes responding to environmental stimuli to phenotypic variation deserves a more thorough exploration in the near future. In summary, this review emphasizes that gene duplication plays an important role in genome evolution and phenotypic variability, and can cause disease phenotypes.

    ACKNOWLEDGMENTS

    We thank Drs. Jacques Beckmann, Samuel Deutsch, and Henrik Kaessmann for reading the manuscript. B.C. is supported by the SNSF and Helmut Horten Foundation; S.E.A. is supported by the SNSF, EU, NIH, and Childcare Foundation, and by funds from the University of Geneva.

    LITERATURE CITED

    1. Aitman TJ, Dong R, Vyse TJ, Norsworthy PJ, Johnson MD, et al. 2006. Copy number polymorphism in Fcgr3 predisposes to glomerulonephritis in rats and humans. Nature 439:851-55


    2. Akahoshi K, Ohashi H, Hattori Y, Saitoh S, Fukushima Y, Wada T. 2005. A woman with 46,XX,dup(16)(p13.11 p13.3) and the ATR-X phenotype. Am. J. Med. Genet. A 132:414-18

    3. Aldred PM, Hollox EJ, Armour JA. 2005. Copy number polymorphism and expression level variation of the human alpha-defensin genes DEFA1 and DEFA3. Hum. Mol. Genet. 14:2045-52

    4. Armengol L, Pujana MA, Cheung J, Scherer SW, Estivill X. 2003. Enrichment of segmental duplications in regions of breaks of synteny between the human and mouse genomes suggest their involvement in evolutionary rearrangements. Hum. Mol. Genet. 12:2201-8

    5. Bailey JA, Baertsch R, Kent WJ, Haussler D, Eichler EE. 2004. Hotspots of mammalian chromosomal evolution. Genome Biol. 5:R23

    6. Bailey JA, Gu Z, Clark RA, Reinert K, Samonte RV, et al. 2002. Recent segmental duplications in the human genome. Science 297:1003-7

    7. Bailey JA, Liu G, Eichler EE. 2003. An Alu transposition model for the origin and expansion of human segmental duplications. Am. J. Hum. Genet. 73:823-34

    8. Berg J, Lassig M, Wagner A. 2004. Structure and evolution of protein interaction networks: a statistical model for link dynamics and gene duplications. BMC Evol. Biol. 4:51

    9. Birchler JA, Bhadra U, Bhadra MP, Auger DL. 2001. Dosage-dependent gene regulation in multicellular eukaryotes: implications for dosage compensation, aneuploid syndromes and quantitative traits. Dev. Biol. 234:275-88

    10. Birchler JA, Riddle NC, Auger DL, Veitia RA. 2005. Dosage balance in gene regulation: biological implications. Trends Genet. 21:219-26

    11. Bischof JM, Chiang AP, Scheetz TE, Stone EM, Casavant TL, et al. 2006. Genome-wide identification of pseudogenes capable of disease-causing gene conversion. Hum. Mutat. 27:545-52

    12. Caraco Y. 2004. Genes and the response to drugs. N. Engl. J. Med. 351:2867-69

    13. Casneuf T, De Bodt S, Raes J, Maere S, Van de Peer Y. 2006. Nonrandom divergence of gene expression following gene and genome duplications in the flowering plant Arabidopsis thaliana. Genome Biol. 7:R13

    14. Chen CP, Lin SP, Lin CC, Chen YJ, Chern SR, et al. 2006. Molecular cytogenetic analysis of de novo dup(5)(q35.2q35.3) and review of the literature of pure partial trisomy 5q. Am. J. Med. Genet. A 140:1594-600

    15. Cheng Z, Ventura M, She X, Khaitovich P, Graves T, et al. 2005. A genome-wide comparison of recent chimpanzee and human segmental duplications. Nature 437:88-93

    16. Cheung VG, Nowak N, Jang W, Kirsch IR, Zhao S, et al. 2001. Integration of cytogenetic landmarks into the draft sequence of the human genome. Nature 409:953-58

    17. Chung WY, Albert R, Albert I, Nekrutenko A, Makova KD. 2006. Rapid and asymmetric divergence of duplicate genes in the human gene coexpression network. BMC Bioinformatics 7:46

    18. Conant GC, Wagner A. 2002. GenomeHistory: a software tool and its application to fully sequenced genomes. Nucleic Acids Res. 30:3378-86

    19. Davis JC, Petrov DA. 2004. Preferential duplication of conserved proteins in eukaryotic genomes. PLOS Biol. 2:E55

    20. Davis JC, Petrov DA. 2005. Do disparate mechanisms of duplication add similar genes to the genome? Trends Genet. 21:548-51

    21. de Mollerat X, Gurrieri F, Morgan CT, Sangiorgi E, Everman DB, et al. 2003. A genomic rearrangement resulting in a tandem duplication is associated with split hand-split foot malformation 3 (SHFM3) at 10q24. Hum. Mol. Genet. 12:1959-71


    22. Dehal P, Boore JL. 2005. Two rounds of whole genome duplication in the ancestral vertebrate. PLOS Biol. 3:e314

    23. Dermuth JP, De Bie T, Stjich JE, Cristianini N, Hahn MW. 2006. PLoS 1:e85

    24. Derti A, Roth FP, Church GM, Wu CT. 2006. Mammalian ultraconserved elements are strongly depleted among segmental duplications and copy number variants. Nat. Genet. 38:1216-20

    25. Deutschbauer AM, Jaramillo DF, Proctor M, Kumm J, Hillenmeyer ME, et al. 2005. Mechanisms of haploinsufficiency revealed by genome-wide profiling in yeast. Genetics 169:1915-25

    26. Eichler EE. 2001. Recent duplication, domain accretion and the dynamic mutation of the human genome. Trends Genet. 17:661-69

    27. Ellis D, Malcolm S. 1994. Proteolipid protein gene dosage effect in Pelizaeus-Merzbacher disease. Nat. Genet. 6:333-34

    28. Feldkotter M, Schwarzer V, Wirth R, Wienker TF, Wirth B. 2002. Quantitative analyses of SMN1 and SMN2 based on real-time lightCycler PCR: fast and highly reliable carrier testing and prediction of severity of spinal muscular atrophy. Am. J. Hum. Genet. 70:358-68

    29. Fellermann K, Stange DE, Schaeffeler E, Schmalzl H, Wehkamp J, et al. 2006. A chromosome 8 gene-cluster polymorphism with low human beta-defensin 2 gene copy number predisposes to Crohn disease of the colon. Am. J. Hum. Genet. 79:439-48

    30. Force A, Lynch M, Pickett FB, Amores A, Yan YL, Postlethwait J. 1999. Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151:1531-45

    31. Freeling M, Thomas BC. 2006. Gene-balanced duplications, like tetraploidy, provide predictable drive to increase morphological complexity. Genome Res. 16:805-14

    32. Gajecka M, Yu W, Ballif BC, Glotzbach CD, Bailey KA, et al. 2005. Delineation of mechanisms and regions of dosage imbalance in complex rearrangements of 1p36 leads to a putative gene for regulation of cranial suture closure. Eur. J. Hum. Genet. 13:139-49

    33. Gimelbrant AA, Chess A. 2006. An epigenetic state associated with areas of gene duplication. Genome Res. 16:723-29

    34. Gonzalez E, Kulkarni H, Bolivar H, Mangano A, Sanchez R, et al. 2005. The influence of CCL3L1 gene-containing segmental duplications on HIV-1/AIDS susceptibility. Science 307:1434-40

    35. Gu X, Wang Y, Gu J. 2002. Age distribution of human gene families shows significant roles of both large- and small-scale duplications in vertebrate evolution. Nat. Genet. 31:205-9

    36. Gu X, Zhang Z, Huang W. 2005. Rapid evolution of expression and regulatory divergences after yeast gene duplication. Proc. Natl. Acad. Sci. USA 102:707-12

    37. Gu Z, Rifkin SA, White KP, Li WH. 2004. Duplicate genes increase gene expression diversity within and between species. Nat. Genet. 36:577-79

    38. Hardy J. 2006. Amyloid double trouble. Nat. Genet. 38:11-12

    39. He X, Zhang J. 2005. Gene complexity and gene duplicability. Curr. Biol. 15:1016-21

    40. He X, Zhang J. 2005. Transcriptional reprogramming and backup between duplicate genes: Is it a genome-wide phenomenon? Genetics 172:1363-67

    41. Hildebrandt MA, Salavaggione OE, Martin YN, Flynn HC, Jalal S, et al. 2004. Human SULT1A3 pharmacogenetics: gene duplication and functional genomic studies. Biochem. Biophys. Res. Commun. 321:870-78

    42. Huminiecki L, Wolfe KH. 2004. Divergence of spatial gene expression profiles following species-specific gene duplications in human and mouse. Genome Res. 14:1870-79


    43. Iafrate AJ, Feuk L, Rivera MN, Listewnik ML, Donahoe PK, et al. 2004. Detection of large-scale variation in the human genome. Nat. Genet. 36:949-51

    44. Jimenez-Sanchez G, Childs B, Valle D. 2001. Human disease genes. Nature 409:853-55

    45. Jordan IK, Wolf YI, Koonin EV. 2004. Duplicated genes evolve slower than singletons despite the initial rate increase. BMC Evol. Biol. 4:22

    46. Kondrashov FA, Koonin EV. 2004. A common framework for understanding the origin of genetic dominance and evolutionary fates of gene duplications. Trends Genet. 20:287-90

    47. Kondrashov FA, Rogozin IB, Wolf YI, Koonin EV. 2002. Selection in the evolution of gene duplications. Genome Biol. 3:RESEARCH0008

    48. Kopelman NM, Lancet D, Yanai I. 2005. Alternative splicing and gene duplication are inversely correlated evolutionary mechanisms. Nat. Genet. 37:588-89

    49. Li J, Uversky VN, Fink AL. 2001. Effect of familial Parkinson's disease point mutations A30P and A53T on the structural properties, aggregation, and fibrillation of human alpha-synuclein. Biochemistry 40:11604-13

    50. Li WH, Gu Z, Cavalcanti AR, Nekrutenko A. 2003. Detection of gene duplications and block duplications in eukaryotic genomes. J. Struct. Funct. Genom. 3:27-34

    51. Linardopoulou EV, Williams EM, Fan Y, Friedman C, Young JM, Trask BJ. 2005. Human subtelomeres are hot spots of interchromosomal recombination and segmental duplication. Nature 437:94-100

    52. Luger K, Hansen JC. 2005. Nucleosome and chromatin fiber dynamics. Curr. Opin. Struct. Biol. 15:188-96

    53. Luikenhuis S, Giacometti E, Beard CF, Jaenisch R. 2004. Expression of MeCP2 in postmitotic neurons rescues Rett syndrome in mice. Proc. Natl. Acad. Sci. USA 101:6033-38

    54. Lupski JR, Stankiewicz P. 2005. Genomic disorders: molecular mechanisms for rearrangements and conveyed phenotypes. PLOS Genet. 1:e49

    55. Lupski JR, Wise CA, Kuwano A, Pentao L, Parke JT, et al. 1992. Gene dosage is a mechanism for Charcot-Marie-Tooth disease type 1A. Nat. Genet. 1:29-33

    56. Lynch M, Conery JS. 2000. The evolutionary fate and consequences of duplicate genes. Science 290:1151-55

    57. Lynch M, Force A. 2000. The probability of duplicate gene preservation by subfunctionalization. Genetics 154:459-73

    58. Maere S, De Bodt S, Raes J, Casneuf T, Van Montagu M, et al. 2005. Modeling gene and genome duplications in eukaryotes. Proc. Natl. Acad. Sci. USA 102:5454-59

    59. Maher C, Stein L, Ware D. 2006. Evolution of Arabidopsis microRNA families through duplication events. Genome Res. 16:510-19

    60. Makova KD, Li WH. 2003. Divergence in the spatial pattern of gene expression between human duplicate genes. Genome Res. 13:1638-45

    61. Marland E, Prachumwat A, Maltsev N, Gu Z, Li WH. 2004. Higher gene duplicabilities for metabolic proteins than for nonmetabolic proteins in yeast and E. coli. J. Mol. Evol. 59:806-14

    62. Maslov S, Sneppen K, Eriksen KA, Yan KK. 2004. Upstream plasticity and downstream robustness in evolution of molecular networks. BMC Evol. Biol. 4:9

    63. McEwen GK, Woolfe A, Goode D, Vavouri T, Callaway H, Elgar G. 2006. Ancient duplicated conserved noncoding elements in vertebrates: a genomic and functional analysis. Genome Res. 16:451-65

    64. Moog U, Engelen J, Albrechts J, Hoorntje T, Hendrikse F, Schrander-Stumpel C. 1996. Alagille syndrome in a family with duplication 20p11. Clin. Dysmorphol. 5:279-88

    65. Murakami T, Garcia CA, Reiter LT, Lupski JR. 1996. Charcot-Marie-Tooth disease and related inherited neuropathies. Medicine (Baltimore) 75:233-50


    66. Murphy WJ, Larkin DM, Everts-van der Wind A, Bourque G, Tesler G, et al. 2005. Dynamics of mammalian chromosome evolution inferred from multispecies comparative maps. Science 309:613-17

    67. Nei M, Rogozin IB, Piontkivska H. 2000. Purifying selection and birth-and-death evolution in the ubiquitin gene family. Proc. Natl. Acad. Sci. USA 97:10866-71

    68. Nei M, Rooney AP. 2005. Concerted and birth-and-death evolution of multigene families. Annu. Rev. Genet. 39:121-52

    69. Nguyen DQ, Webber C, Ponting CP. 2006. Bias of selection on human copy-number variants. PLOS Genet. 2:e20

    70. Ohno S. 1970. Evolution by Gene Duplication. New York: Springer Verlag

    71. Olson EN. 2006. Gene regulatory networks in the evolution and development of the heart. Science 313:1922-27

    72. Olson MV. 1999. When less is more: gene loss as an engine of evolutionary change. Am. J. Hum. Genet. 64:18-23

    73. Ouahchi K, Lindeman N, Lee C. 2006. Copy number variants and pharmacogenomics. Pharmacogenomics 7:25-29

    74. Ounap K, Ilus T, Laidre P, Uibo O, Tammur P, Bartsch O. 2005. A new case of 2q duplication supports either a locus for orofacial clefting between markers D2S1897 and D2S2023 or a locus for cleft palate only on chromosome 2q13-q21. Am. J. Med. Genet. A 137:323-27

    75. Padiath QS, Saigoh K, Schiffmann R, Asahara H, Yamada T, et al. 2006. Lamin B1 duplications cause autosomal dominant leukodystrophy. Nat. Genet. 38:1114-23

    76. Papp B, Pal C, Hurst LD. 2003. Dosage sensitivity and the evolution of gene families in yeast. Nature 424:194-97

    77. Phillips KA, Veenstra DL, Oren E, Lee JK, Sadee W. 2001. Potential role of pharmacogenomics in reducing adverse drug reactions: a systematic review. JAMA 286:2270-79

    78. Pisitkun P, Deane JA, Difilippantonio MJ, Tarasenko T, Satterthwaite AB, Bolland S. 2006. Autoreactive B cell responses to RNA-related antigens due to TLR7 gene duplication. Science 312:1669-72

    79. Potocki L, Chen KS, Park SS, Osterholm DE, Withers MA, et al. 2000. Molecular mechanism for duplication 17p11.2: the homologous recombination reciprocal of the Smith-Magenis microdeletion. Nat. Genet. 24:84-87

    80. Redon R, Ishikawa S, Fitch KR, Feuk L, Perry GH, et al. 2006. Global variation in copy number in the human genome. Nature 444:444-54

    81. Roper RJ, Reeves RH. 2006. Understanding the basis for Down syndrome phenotypes. PLOS Genet. 2:e50

    82. Rovelet-Lecrux A, Hannequin D, Raux G, Meur NL, Laquerriere A, et al. 2006. APP locus duplication causes autosomal dominant early-onset Alzheimer disease with cerebral amyloid angiopathy. Nat. Genet. 38:24-26

    83. Samonte RV, Eichler EE. 2002. Segmental duplications and the evolution of the primate genome. Nat. Rev. Genet. 3:65-72

    84. Schedl A, Ross A, Lee M, Engelkamp D, Rashbass P, et al. 1996. Influence of PAX6 gene dosage on development: overexpression causes severe eye abnormalities. Cell 86:71-82

    85. Sebat J, Lakshmi B, Troge J, Alexander J, Young J, et al. 2004. Large-scale copy number polymorphism in the human genome. Science 305:525-28

    86. Semple CA, Gautier P, Taylor K, Dorin JR. 2006. The changing of the guard: Molecular diversity and rapid evolution of beta-defensins. Mol. Divers. 10:575-84

    87. Sen SK, Han K, Wang J, Lee J, Wang H, et al. 2006. Human genomic deletions mediated by recombination between Alu elements. Am. J. Hum. Genet. 79:41-53


    88. Sharp AJ, Locke DP, McGrath SD, Cheng Z, Bailey JA, et al. 2005. Segmental duplica-

    tions and copy-number variation in the human genome. Am. J. Hum. Genet.77:788889. Shaw CJ, Lupski JR. 2004. Implications of human genome architecture for

    rearrangement-based disorders: the genomic basis of disease. Hum. Mol. Genet.13 SpecNo 1:R57-64

    90. She X, Horvath JE, Jiang Z, Liu G, Furey TS, et al. 2004. The structure and evolutionof centromeric transition regions within the human genome. Nature430:85764

    91. Shiu SH, Byrnes JK, Pan R, Zhang P, Li WH. 2006. Role of positive selection in theretention of duplicate genes in mammalian genomes.Proc. Natl. Acad. Sci. USA 103:223236

    92. Shoja V, Zhang L. 2006. A roadmap of tandemly arrayed genes in the genomes of humanmouse, and rat.Mol. Biol. Evol.23:213441

    93. Singleton A, Gwinn-Hardy K. 2004. Parkinsons disease and dementia with Lewy bodiesa difference in dose?Lancet364:11057

    94. Singleton AB. 2005. Altered alpha-synuclein homeostasis causing Parkinsons disease: thepotential roles of dardarin.Trends Neurosci.28:41621

    95. SomervilleMJ,MervisCB,YoungEJ,SeoEJ,delCampoM,etal.2005.Severeexpressive-language delay related to duplication of the Williams-Beuren locus. N. Engl. J. Med.

    353:169470196. Sopko R, Huang D, Preston N, Chua G, Papp B, et al. 2006. Mapping pathways and

    phenotypes by systematic gene overexpression.Mol. Cell21:31930

    97. Stankiewicz P, Lupski JR. 2002. Genome architecture, rearrangements and genomicdisorders.Trends Genet.18:7482

    98. Su C, Nei M. 2001. Evolutionary dynamics of the T-cell receptor VB gene family asinferred from the human and mouse genomic sequences. Mol. Biol. Evol.18:50313

    99. Su Z, Wang J, Yu J, Huang X, Gu X. 2006. Evolution of alternative splicing after geneduplication.Genome Res.16:18289

    100. Van de Peer Y. 2004. Computational approaches to unveiling ancient genome duplica-tions.Nat. Rev. Genet.5:75263

    101. Van Esch H, Bauters M, Ignatius J, Jansen M, Raynaud M, et al. 2005. Duplicationof the MECP2 region is a frequent cause of severe mental retardation and progressive

    neurological symptoms in males.Am. J. Hum. Genet.77:44253102. Veitia RA. 2002. Exploring the etiology of haploinsufficiency.Bioessays24:17584

    103. Vinckenbosch N, Dupanloup I, Kaessmann H. 2006. Evolutionary fate of retroposed gene copies in the human genome. Proc. Natl. Acad. Sci. USA 103:3220–25

    104. Wagner A. 2000. Decoupled evolution of coding region and mRNA expression patterns after gene duplication: implications for the neutralist-selectionist debate. Proc. Natl. Acad. Sci. USA 97:6579–84

    105. Wang X, Grus WE, Zhang J. 2006. Gene losses during human origins. PLoS Biol. 4:e52

    106. Wong P, Fritz A, Frishman D. 2005. Designability, aggregation propensity and duplication of disease-associated proteins. Protein Eng. Des. Sel. 18:503–8

    107. Woods KS, Cundall M, Turton J, Rizotti K, Mehta A, et al. 2005. Over- and underdosage of SOX3 is associated with infundibular hypoplasia and hypopituitarism. Am. J. Hum. Genet. 76:833–49

    108. Xue Y, Daly A, Yngvadottir B, Liu M, Coop G, et al. 2006. Spread of an inactive form of caspase-12 in humans is due to recent positive selection. Am. J. Hum. Genet. 78:659–70

    109. Yobb TM, Somerville MJ, Willatt L, Firth HV, Harrison K, et al. 2005. Microduplication and triplication of 22q11.2: a highly variable syndrome. Am. J. Hum. Genet. 76:865–76


    110. Zhang L, Lu HH, Chung WY, Yang J, Li WH. 2005. Patterns of segmental duplication in the human genome. Mol. Biol. Evol. 22:135–41

    111. Zhang Z, Gu J, Gu X. 2004. How much expression divergence after yeast gene duplication could be explained by regulatory motif evolution? Trends Genet. 20:403–7

    112. Zhang Z, Luo ZW, Kishino H, Kearsey MJ. 2005. Divergence pattern of duplicate genes in protein-protein interactions follows the power law. Mol. Biol. Evol. 22:501–5

    113. Zody MC, Garber M, Sharpe T, Young SK, Rowen L, et al. 2006. Analysis of the DNA sequence and duplication history of human chromosome 15. Nature 440:671–75

    114. Zody MC, Garber M, Adams DJ, Sharpe T, Harrow J, et al. 2006. DNA sequence of human chromosome 17 and analysis of rearrangement in the human lineage. Nature 440:1045–49

