Page 1: Supplementary Information: Frequent mutations of …...Supplementary Information: Frequent mutations of genes encoding ubiquitin-mediated proteolysis pathway components in clear cell

Supplementary Information:

Frequent mutations of genes encoding ubiquitin-mediated proteolysis

pathway components in clear cell renal cell carcinoma

Guangwu Guo1,10, Yaoting Gui2,10, Shengjie Gao1,10, Aifa Tang2,3,10, Xueda Hu1,10, Yi

Huang2,3,10, Wenlong Jia1, Zesong Li2,3, Minghui He1, Liang Sun2, Pengfei Song1,

Xiaojuan Sun3, Xiaokun Zhao4, Sangming Yang1, Chaozhao Liang5, Shengqing Wan1,

Fangjian Zhou6, Chao Chen1, Jialou Zhu1,7, Xianxin Li2, Minghan Jian1, Liang Zhou2,

Rui Ye1, Peide Huang1, Jing Chen2, Xiao Liu1, Yong Wang2, Jing Zou1, Zhimao Jiang2,

RenHua Wu1, Song Wu2, Fan Fan1, Zhongfu Zhang2, Lin Liu1, Ruilin Yang2, Xingwang

Liu1, Haibo Wu1, Weihua Yin2, Xia Zhao1, Yuchen Liu2, Huanhuan Peng1, Binghua Jiang2,

Qingxin Feng2, Cailing Li2, Jun Xie2, Jingxiao Lu2, Karsten Kristiansen1,8, Yingrui Li1,

Xiuqing Zhang1, Songgang Li1, Jian Wang1, Huanming Yang1, Zhiming Cai2,3 & Jun

Wang1,8,9

1Shenzhen Key Laboratory of Transomics Biotechnologies, BGI-Shenzhen, Shenzhen 518083,

China. 2Guangdong and Shenzhen Key Laboratory of Male Reproductive Medicine and Genetics,

Institute of Urology, Peking University Shenzhen Hospital, Shenzhen PKU-HKUST Medical

Center, Shenzhen 518036, China.

3Shenzhen Second People's Hospital, the First Affiliated Hospital of Shenzhen University,

Shenzhen 518035, China. 4Department of Urology, the Second Xiangya Hospital of Central-Southern University, Changsha

410011, China.

5Department of Urology, the First Affiliated Hospital of Anhui Medical University, Hefei 230022,

China. 6Department of Urology, Sun Yat-Sen University Cancer Center, Guangzhou 510060, China. 7College of Life Science, Wuhan University, Wuhan, 430072, China.

8Department of Biology, University of Copenhagen, DK-1165 Copenhagen, Denmark. 9The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen,

2200 Copenhagen, Denmark.

Nature Genetics: doi:10.1038/ng.1014

10These authors contributed equally to this work.

Correspondence should be addressed to Ju.W. ([email protected]), Z.C.

([email protected]) and H.Y. ([email protected])

Supplementary Methods

Sample description and preparation

Tumors with matched normal controls (morphologically adjacent normal kidney tissues

cut at least 5 cm away from the boundaries of the primary tumors) were obtained from

patients with clear cell renal cell carcinoma (ccRCC) newly diagnosed at member

institutions of the Urinogenital Cancer Genomics Consortium (UCGC) in China. Signed

written consent was obtained from each patient before recruitment into the study

according to the regulations of the institutional ethics review boards. Detailed clinical

information on the patients is summarized in Supplementary Table 1. All the specimens

were snap-frozen in liquid nitrogen upon collection and immediately stored at -80°C for

further study. The hematoxylin-eosin (HE)-stained sections prepared using the cancerous

or apparently normal tissues were microscopically evaluated by two independent

pathologists. Typically, the tumor cells with clear transparent cytoplasm and well-defined

cell membrane were interspersed within the highly vascularized stroma. The normal

control samples were characterized by the presence of normal renal tubules and

glomeruli in each microscopic field, without any notable traces of tumor cell

contamination. In this study, only ccRCCs with malignant cell purities over 85% were

selected for DNA extraction and subsequent sequencing.

Genomic DNA extraction and whole-exome sequencing

In the Discovery Screen, genomic DNAs of tumor and matched normal samples from 10

ccRCC patients were isolated using QIAamp DNA Mini Kits (QIAGEN, Hilden,

Germany) according to the protocol provided by the manufacturer. Genomic DNAs were

then fragmented and hybridized to NimbleGen 2.1M Human Exome Arrays (Roche

NimbleGen, Inc, USA), which were capable of enriching the exonic sequences of more

than 18,000 protein-coding genes deposited in the highly curated database of Consensus

Coding Sequence Region (http://www.ncbi.nlm.nih.gov/projects/CCDS).

In brief, all the extracted genomic DNAs were randomly sonicated to a smear of 300 bp

to 800 bp for polishing with T4 DNA polymerase and T4 polynucleotide kinase.

NimbleGen linkers were added to the polished DNA fragments by T4 DNA ligase. The

ligated products were then hybridized to the capture array according to the manufacturer’s

protocol, and the enriched DNA fragments were eluted and amplified by ligation-mediated

PCR through the linkers added to exonic DNA fragments. Before the second run of

library construction, qPCR reactions were performed to estimate the enrichment rate of

exonic sequences. A total of four targeted exons were selected for this evaluation. The

minimum requirement of 80-fold enrichment was achieved for all libraries prepared for

the next procedure. The enriched exonic DNA was randomly ligated with blunt-ends by

DNA ligase to fragments ranging from 2 kb to 5 kb in size. The resulting DNA products

were sheared to 200 bp on average and were subjected to standard Illumina Genome

Analyzer (GA) library preparation according to Illumina’s protocol. The exome-enriched

shotgun libraries were sequenced with the Illumina GA II platform and single-end reads

with an average size of 80 bp were generated. Image analysis and base calling were performed

by the Genome Analyzer Pipeline version 1.3 with default parameters.

Illumina-based exon resequencing of selected genes

In the Prevalence Screen, we determined the mutation frequencies of selected genes

(Supplementary Table 4) in 88 additional ccRCC patients by Illumina-based exonic

resequencing. These selected genes included: 1) 234 genes that had at least one non-silent

somatic mutation in the discovery stage; 2) 413 genes that have been causally implicated

in human cancers (Cancer Gene Census, http://www.sanger.ac.uk/genetics/CGP/Census/); 3)

367 genes previously reported to harbor mutations in ccRCC (COSMIC database1);

and 4) 113 genes in the ubiquitin-mediated proteolysis pathway (135 genes have been

annotated in this pathway, of which 22 showed somatic mutations in the above three

categories). All the exonic regions of these 1127 genes were submitted to NimbleGen for

the manufacturing of the targeted capture arrays. Genomic DNA from tumors and

matched normal samples was then sonicated and hybridized to the arrays followed by the

standard Illumina-based resequencing procedures as described above.

Reads mapping and detection of somatic mutations

After removing reads containing sequencing adaptors and low-quality reads with more

than five unknown bases, the high quality reads were aligned to the NCBI human

reference genome (hg18) using MAQ2 with the default options. To identify indels, the

high quality reads were gapped aligned to the reference sequence using BWA3. Then, we

performed local realignment of the BWA aligned reads using the Genome Analysis

Toolkit (GATK)4.

The raw lists of potential somatic substitutions were called by VarScan5 (v2.2) based

on the MAQ alignments. In this process, several heuristic rules were applied: (i) both the

tumors and matched normal samples should be covered sufficiently (≥ 10×) at the

genomic position being compared; (ii) the average base quality for a given genomic

position should be at least 15 in both the tumors and normal samples; (iii) the variants

should be supported by at least 10% of the total reads in the tumors while no high quality

variant-supporting reads are allowed in normal controls; (iv) the variants should be

supported by at least five reads in the tumors. Using the same criteria, the preliminary

list of somatic indels was called by GATK based on the local realignment results.

After these two steps, germline variants could be effectively removed.
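The four heuristic rules above can be sketched as a single per-site check. This is only an illustration under stated assumptions: the `SiteEvidence` record and its field names are hypothetical and are not part of VarScan or GATK; only the thresholds (10×, base quality 15, 10% of reads, five reads) come from the text.

```python
from dataclasses import dataclass


@dataclass
class SiteEvidence:
    """Hypothetical per-site summary for one tumor/normal pair."""
    tumor_depth: int               # reads covering the site in the tumor
    normal_depth: int              # reads covering the site in the normal
    tumor_mean_baseq: float        # average base quality in the tumor
    normal_mean_baseq: float       # average base quality in the normal
    tumor_variant_reads: int       # variant-supporting reads in the tumor
    normal_hq_variant_reads: int   # high-quality variant reads in the normal


def passes_somatic_heuristics(site: SiteEvidence) -> bool:
    # (i) both samples covered at >= 10x at this position
    if site.tumor_depth < 10 or site.normal_depth < 10:
        return False
    # (ii) average base quality >= 15 in both samples
    if site.tumor_mean_baseq < 15 or site.normal_mean_baseq < 15:
        return False
    # (iii) variant supported by >= 10% of tumor reads ...
    if site.tumor_variant_reads < 0.10 * site.tumor_depth:
        return False
    # ... and by no high-quality reads in the normal control
    if site.normal_hq_variant_reads > 0:
        return False
    # (iv) at least five variant-supporting reads in the tumor
    return site.tumor_variant_reads >= 5
```

For example, a site with 40× tumor depth and four variant reads meets the 10% rule but fails rule (iv), so it would not be called.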

To further reduce the false positive calls, variations including single nucleotide

variants (SNVs) and indels were called with the SAMtools software package in the

tumors. We eliminated all somatic variants that met any one of the following filtering

criteria: (i) variants with Phred-like scaled consensus scores or SNP qualities < 20; (ii)

variants with mapping qualities < 30; (iii) indels represented by only one DNA strand; (iv)

substitutions located within 30 bp of predicted indels. To deal with false positives associated

with pseudogenes or repeat sequences, simulated reads (80 bp in length) containing

the potential mutations were generated and aligned to the reference genome. For a given

variant, if more than 10% of the simulated variant-containing reads could not be

uniquely mapped to the reference genome, this variant would be discarded.
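A minimal sketch of the simulated-read check follows. It assumes that the simulated reads are all 80 bp windows overlapping the variant position and that an external aligner has already reported, for each read, whether it mapped uniquely; neither detail is stated in the text, and both function names are hypothetical.

```python
def simulate_variant_reads(reference: str, pos: int, alt: str, read_len: int = 80):
    """Return every read_len window of the mutated sequence that covers pos.

    Assumption: tiling all overlapping windows; the original simulation
    scheme is not described in the text.
    """
    mutated = reference[:pos] + alt + reference[pos + 1:]
    first = max(0, pos - read_len + 1)
    last = min(pos, len(mutated) - read_len)
    return [mutated[s:s + read_len] for s in range(first, last + 1)]


def keep_variant(uniquely_mapped, max_ambiguous_percent: int = 10) -> bool:
    """Discard the variant if more than 10% of simulated reads are not unique.

    uniquely_mapped: one bool per simulated read (True = mapped uniquely),
    as reported by whatever aligner is used (not modelled here).
    """
    total = len(uniquely_mapped)
    if total == 0:
        raise ValueError("no simulated reads supplied")
    ambiguous = sum(1 for unique in uniquely_mapped if not unique)
    # Integer comparison avoids floating-point issues at the 10% boundary.
    return ambiguous * 100 <= max_ambiguous_percent * total
```

With ten simulated reads, one ambiguous read (exactly 10%) keeps the variant, while two ambiguous reads discard it.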

In order to eliminate any previously described germline variants, the somatic

mutations were cross-referenced against the dbSNP (version 130) and SNP data sets of

Han Chinese in Beijing (CHB) and Japanese in Tokyo (JPT) from the three pilot studies

in the 1000 Genomes Project (http://www.1000genomes.org). Any mutations present in

the above data sets were filtered out and the remaining mutations were subjected to

subsequent analyses.

Annotation of somatic mutations

We used SIFT6 and PolyPhen7 to evaluate the potential impact on protein function for

167 somatic missense mutations. SIFT, which searches against the human SWALL

database, was used to predict the functional changes based on sequence homology and

physical properties of amino acids. PolyPhen identifies homologues via BLAST search in

the NR database and uses sequence, phylogenetic, and structural information to evaluate

the potential impact on protein function. In our analysis, SIFT predicted 56 missense

mutations as deleterious and PolyPhen predicted 96 missense mutations as

probably/possibly damaging. Taken together, these two programs identified 107 missense

mutations as having potential functional relevance.
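If "taken together" is read as the union of the two prediction sets (an assumption; the overlap is not reported in the text), inclusion-exclusion gives the number of missense mutations flagged by both programs. Here $S$ and $P$ denote the sets flagged by SIFT and PolyPhen; the symbols are introduced only for this illustration.

```latex
\begin{align*}
|S \cup P| &= |S| + |P| - |S \cap P| \\
|S \cap P| &= 56 + 96 - 107 = 45
\end{align*}
```

Under the same reading, 107 of the 167 missense mutations (about 64%) were flagged by at least one program.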

Validation of somatic mutations by mass spectrometry or Sanger sequencing

Validation of the non-silent somatic substitutions by mass spectrometry was performed with

the MassArray platform of Sequenom (San Diego, CA, USA) by determining their

genotypes in the tumors and matched normal samples. The genotyping assay and

base-calling procedures were performed as previously described8. We considered the

genotyping assay to have failed if the Sequenom software was unable to design primers for

PCR amplification or base extension at the primer design stage, or if the observed peak

for a given assay was not significant enough for a confident call at the base-calling stage.

To validate somatic indels using Sanger sequencing, PCR primers designed for the

putative somatic variants were initially used to amplify the source DNA from the tumors.

If the mutations were successfully confirmed in the tumors, the same primer pairs were

used to amplify the normal DNA from the same patients to determine the somatic statuses

of the observed mutations.

Statistical analysis of the significantly mutated genes

The background mutation rate was estimated based on the number of the synonymous

mutations identified in the Discovery Screen, and defined as the product of the

synonymous mutation rate and the ratio of nonsynonymous to synonymous (1.4)

observed in the HapMap database. Briefly, the synonymous somatic mutations in the

Discovery Screen were classified into 7 different categories according to their sequence

contexts and mutation types. For each mutation category $i$, let the observed number of

mutations for this category be $m_i$ and the total number of successfully sequenced

nucleotides (≥ 10×) for this category in the ten tumors be $n_i$; the background mutation

rate for this category, $b_i$, was calculated as $b_i = 1.4 \times m_i / n_i$. The estimated background mutation

rates for each category are listed in Supplementary Methods Table 1. To test whether

the non-silent mutation rate of a given gene was significantly higher than the background,

the confirmed mutation data for the gene obtained from the Discovery and Prevalence

Screens were combined. Then, we estimated the passenger probability for each gene in

turn as described by Sjoblom, T. et al.9. To be specific, the probability ($p_{gi}$) to obtain the

observed number of mutations of each category ($i$) in gene $g$ was estimated from a

binomial distribution with $b_i$ as the success probability. The number of available

nucleotides for each category was the total number of sufficiently covered (≥ 10×) bases

for that particular category in all the 98 ccRCCs. The passenger probability ($p_g$) for gene

$g$ was calculated to be the product of the 7 category-specific probabilities, i.e.

$p_g = \prod_{i=1}^{7} p_{gi}$. We then determined the P value for each gene by the likelihood-ratio test as

described by Getz et al.10. We considered the genes showing significantly (P < 0.05)

higher mutation rate than the background and harboring non-silent mutations in at least

three of the 98 ccRCCs as the significantly mutated genes.

Supplementary Methods Table 1. The background mutation rates of different mutation

types estimated from the Discovery Screen.

Mutation types   Number of mutations   Background mutation rates (per Mb)
A                5                     0.09
C at CpG         6                     1.13
C at non-CpG     14                    0.29
G at CpG         5                     0.94
G at non-CpG     11                    0.23
T                4                     0.07
Indels           45                    0.21
Total            90                    0.42
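The background-rate and passenger-probability calculations described in this section can be sketched as below. This is an illustration, not the authors' code: the function names are hypothetical, the binomial probability is taken as the probability of observing exactly the given number of mutations (one reading of "the probability to obtain the observed number"), and the subsequent likelihood-ratio test is not reproduced.

```python
from math import exp, lgamma, log, log1p, prod

NONSYN_TO_SYN_RATIO = 1.4  # ratio quoted in the text (HapMap)


def background_rate(m_i: int, n_i: int) -> float:
    """b_i = 1.4 * m_i / n_i for one mutation category."""
    return NONSYN_TO_SYN_RATIO * m_i / n_i


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p), computed in log space so that
    n in the millions of bases does not overflow."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log1p(-p))


def passenger_probability(observed, covered, rates) -> float:
    """p_g = product over categories of the category-specific p_gi.

    observed: mutations seen in gene g per category
    covered:  sufficiently covered bases in gene g per category
    rates:    background rates b_i per category (per base, not per Mb)
    """
    return prod(binomial_pmf(k, n, b) for k, n, b in zip(observed, covered, rates))
```

Note that the rates in Supplementary Methods Table 1 are given per Mb, so they would need to be divided by 10^6 before being passed as per-base probabilities.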

Statistical analysis of the inactivating mutations in genes

To assess whether nonsense mutations were overrepresented in the significantly mutated

genes, we calculated the probability (Pn) of single base changes that would result in

nonsense mutations by chance. To this end, the coding sequence of each gene was

represented by its longest transcript and every single base in the coding region was

changed into the other three different bases. We obtained Pn for each gene by dividing the

number of nonsense mutations observed by chance in each gene by the length of the

coding region. The significance of nonsense mutation enrichment was determined by a

binomial test with parameter Pn as the hypothesized probability of success.
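The enumeration described above can be sketched as follows. By default the code follows the text literally and divides the count of nonsense-producing changes by the coding-sequence length; dividing by three times the length (all possible single-base substitutions) is offered as an option, since the text does not make the normalisation explicit. The function names, the one-sided direction of the binomial test, and the choice of the number of trials are assumptions.

```python
from math import comb

STOP_CODONS = {"TAA", "TAG", "TGA"}


def count_nonsense_changes(cds: str) -> int:
    """Count single-base substitutions that turn a sense codon into a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    count = 0
    for start in range(0, len(cds), 3):
        codon = cds[start:start + 3]
        if codon in STOP_CODONS:
            continue  # an existing stop codon cannot gain a nonsense mutation
        for offset in range(3):
            for base in "ACGT":
                if base == codon[offset]:
                    continue
                mutant = codon[:offset] + base + codon[offset + 1:]
                if mutant in STOP_CODONS:
                    count += 1
    return count


def nonsense_probability(cds: str, per_substitution: bool = False) -> float:
    """Pn as described in the text (count / CDS length); with
    per_substitution=True, the fraction of all 3 * length substitutions."""
    denominator = len(cds) * (3 if per_substitution else 1)
    return count_nonsense_changes(cds) / denominator


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """One-sided P(X >= k) for X ~ Binomial(n, p); n is the number of
    mutations observed in the gene and k the number that are nonsense."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))
```

For instance, the tryptophan codon TGG contributes two nonsense-producing changes (TAG and TGA), whereas ATG contributes none.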

VHL promoter methylation analysis by bisulfite sequencing

Genomic DNA (1-2 μg) was subjected to bisulfite modification using the EpiTect

Bisulfite Kit (Qiagen, Hilden, Germany) according to the manufacturer’s instructions. We

obtained the VHL promoter sequences as described in previous studies11,12. The

methylation status of VHL core promoter was tested by Bisulfite Sequencing PCR (BSP).

The BSP primers (5’-AAAAAAAATATTAAATTTTAGAGGGG-3’, and

5’-CRATTACAAAAAATAACCTAAAAAAACTC-3’) were designed by MethPrimer

(http://www.urogene.org/methprimer/)13. Bisulfite-treated DNA (2 μl) was amplified in a

50 μl reaction using HotStar Taq DNA polymerase (Qiagen, Hilden, Germany) with the

following PCR program: 10 min at 95°C, 40 cycles of 45 s at 94°C, 45 s at 53°C and 1

min at 72°C, followed by a final extension step at 72°C for 10 min. PCR products were

purified and ligated with pUC118-Teasy vectors. 15 white clones for each sample were

randomly selected for further Sanger sequencing. The BISMA (Bisulfite Sequencing

DNA Methylation Analysis) software14 was used to determine the methylation status for

each CpG site and present the methylation pattern. The log2 ratio of the fold change (tumor

versus normal) in the methylation level of the VHL promoter was calculated for each patient.

A chi-square test was performed to calculate the P-value and identify differentially

methylated samples. A VHL promoter with a methylation level ≥ 10%, a log2 ratio ≥ 1

and a calculated P-value ≤ 0.001 was defined as hypermethylated.
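The three-part definition above can be sketched as a small classifier. Several details are assumptions that the text does not specify: the counts are taken to be methylated versus total CpG observations pooled over the sequenced clones, the 10% threshold is applied to the tumor sample, and the chi-square test is a 2×2 test without continuity correction. The function names are hypothetical.

```python
from math import erfc, inf, log2, sqrt


def chi_square_2x2_p(a: int, b: int, c: int, d: int) -> float:
    """P-value of a 2x2 chi-square test (1 degree of freedom, no correction).

    Rows: tumor, normal; columns: methylated, unmethylated.
    For 1 d.f. the survival function is erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 1.0  # a margin is empty, so no evidence of a difference
    chi2 = n * (a * d - b * c) ** 2 / denom
    return erfc(sqrt(chi2 / 2))


def is_hypermethylated(tumor_meth: int, tumor_total: int,
                       normal_meth: int, normal_total: int) -> bool:
    tumor_level = tumor_meth / tumor_total
    normal_level = normal_meth / normal_total
    if tumor_level == 0:
        log2_ratio = -inf
    elif normal_level == 0:
        log2_ratio = inf  # any methylation versus none
    else:
        log2_ratio = log2(tumor_level / normal_level)
    p_value = chi_square_2x2_p(tumor_meth, tumor_total - tumor_meth,
                               normal_meth, normal_total - normal_meth)
    return tumor_level >= 0.10 and log2_ratio >= 1 and p_value <= 0.001
```

As a usage example with made-up counts, 60 of 150 methylated CpG observations in the tumor against 3 of 150 in the normal sample satisfies all three criteria.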

Pathway analysis

First, we carried out pathway enrichment analysis using the mutational data from the

unbiased screen of the discovery cohort to see if there were any biochemical pathways

commonly altered among the ccRCC patients. Briefly, we performed the pathway

enrichment analysis using WebGestalt15 (version 2) by examining distribution of the

non-silently mutated genes identified in the Discovery Screen within the KEGG database

(http://www.genome.jp/kegg/). The significance of mutation enrichment was determined

by a hypergeometric test and was adjusted for multiple testing with Benjamini &

Hochberg false discovery rate (FDR).
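The enrichment test and multiple-testing adjustment named above can be sketched with the standard library alone. This is a generic illustration of a one-sided hypergeometric test and the Benjamini-Hochberg adjustment, not a reproduction of WebGestalt's internals; the function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k: int, population: int, in_pathway: int, drawn: int) -> float:
    """P(X >= k): chance of seeing at least k pathway genes among `drawn`
    mutated genes, given `in_pathway` pathway genes in `population` genes."""
    total = comb(population, drawn)
    upper = min(in_pathway, drawn)
    favourable = sum(
        comb(in_pathway, j) * comb(population - in_pathway, drawn - j)
        for j in range(k, upper + 1)
    )
    return favourable / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Each KEGG pathway would yield one P value from `hypergeom_upper_tail`, and the full list of pathway P values would then be passed to `benjamini_hochberg`.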

We found the ubiquitin-mediated proteolysis pathway (UMPP) was ranked as the

most significantly mutated pathway in the Discovery Screen and then screened all genes

in this pathway in the Prevalence Screen. To determine whether the mutation frequency

in the UMPP was significantly greater than the background, we used a method similar to that

described for determining the significantly mutated genes. For each of 7 mutation

types, we determined the numbers of the somatic mutations and successfully sequenced

nucleotides for the genes of UMPP in the 98 ccRCCs. Then we considered the UMPP as

‘one gene’ and determined the significance of the mutation frequency in the UMPP. The

passenger mutation frequencies in Supplementary Methods Table 1 were used in this

analysis.

Immunohistochemical analysis of HIF1α and HIF2α

Samples were embedded in paraffin wax and cut into 3 μm sections. The

sections were then deparaffinized and rehydrated. After antigen retrieval for 15 min in 10 mM

citrate buffer (pH 6.0), endogenous peroxidase was blocked by treating the slides with

3% hydrogen peroxide for 15 min. Sections were then treated with 8% BSA

and incubated overnight at 4°C with a 1:150 dilution of anti-HIF1α (Sigma-Aldrich Co.

USA) primary monoclonal mouse antibody, or with a 1:5000 dilution of anti-HIF-2α (PL

Lab, China) primary rabbit antibody. After being washed in PBS, the sections were

treated with MaxVision HRP-Polymer anti-Mouse IHC Kit (Maixin Bio, Fujian, China).

Subsequently, DAB kit (Maixin Bio, Fujian, China) was used to visualize the peroxidase

activity. The negative control procedures for the immunohistochemical staining were

performed by replacing the primary antibodies with antibody diluents.

Immunohistochemical results were evaluated by two different pathologists

according to pre-established uniform criteria. The expression of HIF1α or HIF2α was

evaluated by calculating a total immunostaining score (TIS) as the product of a

proportion score (PS) and an intensity score (IS). The PS describes the estimated fraction

of positively stained cells (0, none; 1, <10%; 2, 10–50%; 3, 51–80%; 4, >80%). The IS

represents the estimated staining intensity: 0, no staining; 1, weak (light yellow); 2,

moderate (yellow brown); 3, strong (brown). The TIS (TIS=PS × IS) ranges from 0 to 12

with only nine possible values (that is, 0, 1, 2, 3, 4, 6, 8, 9 and 12). Then, we classified the

expression levels of the two proteins into four subgroups (Supplementary Fig. 3 and

Supplementary Fig. 4): -, no expression, TIS 0; +, weak expression, TIS 1–4; ++,

moderate expression, TIS 6 and 8; and +++, intense expression, TIS 9 and 12 (ref. 16). We

considered HIF1α or HIF2α to be over-expressed in the tumors, compared with the

matched normal control samples, if the TIS values of the tumors were in the subgroups

with higher expression level than the matched normal samples.
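The scoring scheme above translates directly into a few lines of code. The thresholds and subgroup boundaries come from the text; the function names are hypothetical, and the handling of a non-integer percentage between 50 and 51 (assigned to score 3 here) is an assumption.

```python
SUBGROUPS = ["-", "+", "++", "+++"]  # ordered from no to intense expression


def proportion_score(percent_positive: float) -> int:
    """PS: 0, none; 1, <10%; 2, 10-50%; 3, 51-80%; 4, >80%."""
    if percent_positive <= 0:
        return 0
    if percent_positive < 10:
        return 1
    if percent_positive <= 50:
        return 2
    if percent_positive <= 80:
        return 3
    return 4


def expression_subgroup(ps: int, intensity: int) -> str:
    """Map TIS = PS x IS onto the four subgroups used in the text."""
    if not 0 <= ps <= 4 or not 0 <= intensity <= 3:
        raise ValueError("PS must be 0-4 and IS must be 0-3")
    tis = ps * intensity
    if tis == 0:
        return "-"
    if tis <= 4:
        return "+"
    if tis <= 8:
        return "++"   # TIS 6 or 8
    return "+++"      # TIS 9 or 12


def is_overexpressed(tumor_group: str, normal_group: str) -> bool:
    """Over-expression: the tumor falls in a higher subgroup than its matched normal."""
    return SUBGROUPS.index(tumor_group) > SUBGROUPS.index(normal_group)
```

For example, a tumor with 60% positive cells (PS 3) and strong staining (IS 3) has TIS 9 and falls in the +++ subgroup.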

References

1. Forbes, S.A. et al. The Catalogue of Somatic Mutations in Cancer (COSMIC). Curr Protoc Hum Genet Chapter 10, Unit 10.11 (2008).
2. Li, H., Ruan, J. & Durbin, R. Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res 18, 1851-8 (2008).
3. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754-60 (2009).
4. McKenna, A. et al. The Genome Analysis Toolkit: A MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res 20, 1297-303 (2010).
5. Koboldt, D.C. et al. VarScan: variant detection in massively parallel sequencing of individual and pooled samples. Bioinformatics 25, 2283-5 (2009).
6. Ng, P.C. & Henikoff, S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res 31, 3812-4 (2003).
7. Sunyaev, S. et al. Prediction of deleterious human alleles. Hum Mol Genet 10, 591-7 (2001).
8. Yi, X. et al. Sequencing of 50 human exomes reveals adaptation to high altitude. Science 329, 75-8 (2010).
9. Sjoblom, T. et al. The consensus coding sequences of human breast and colorectal cancers. Science 314, 268-74 (2006).
10. Getz, G. et al. Comment on "The consensus coding sequences of human breast and colorectal cancers". Science 317, 1500 (2007).
11. Kuzmin, I. et al. Identification of the promoter of the human von Hippel-Lindau disease tumor suppressor gene. Oncogene 10, 2185-94 (1995).
12. Zatyka, M. et al. Genetic and functional analysis of the von Hippel-Lindau (VHL) tumour suppressor gene promoter. J Med Genet 39, 463-72 (2002).
13. Li, L.C. & Dahiya, R. MethPrimer: designing primers for methylation PCRs. Bioinformatics 18, 1427-31 (2002).
14. Rohde, C., Zhang, Y., Reinhardt, R. & Jeltsch, A. BISMA--fast and accurate bisulfite sequencing data analysis of individual clones from unique and repetitive sequences. BMC Bioinformatics 11, 230 (2010).
15. Zhang, B., Kirov, S. & Snoddy, J. WebGestalt: an integrated system for exploring gene sets in various biological contexts. Nucleic Acids Res 33, W741-8 (2005).
16. Spizzo, G. et al. EpCAM expression in primary tumour tissues and metastases: an immunohistochemical analysis. J Clin Pathol 64, 415-20 (2011).

Supplementary Figures

Supplementary Figure 1. Fold coverage of targeted regions for the tumor and

normal control samples sequenced in the Discovery Screen. Reads with the same start

and end sites within the same library were eliminated to filter out potential PCR

duplication bias. N, normal control samples; T, tumor samples. a, The box plot depicts

the distribution of fold coverage of target regions across the 10 ccRCC patients. The bold

lines in boxes show the medians and the lines outside the boxes show the first or third

quartiles of fold coverage. b, The box plot depicts the distribution of fraction of targeted

bases covered by at least 1 read and by at least 10 reads across the 20 samples. The lines in boxes

show the medians and the lines outside the boxes show the first or third quartiles of

fraction of targeted bases covered by reads.

Supplementary Figure 2. Frequent mutations of ubiquitin-mediated proteolysis

pathway (UMPP) in ccRCC. Somatic mutations in the HIF-α pathway (grey) and

UMPP (green) are shown. Yellow stars indicate the presence of somatic mutations, which

were confirmed by genotyping or Sanger sequencing. Numbers in parentheses refer to the

mutation prevalence of genes or pathway.

Supplementary Figure 3. Representative immunohistochemical staining results of

HIF1α in ccRCC tissues. Expression levels of HIF1α protein were classified into four

subgroups: (A) -, no expression; (B) +, weak expression; (C) ++, moderate expression;

and (D) +++, intense expression.

Supplementary Figure 4. Representative immunohistochemical staining results of

HIF2α in ccRCC tissues. Expression levels of HIF2α protein were classified into four

subgroups: -, no expression (A); +, weak expression (B); ++, moderate expression (C);

and +++, intense expression. D represents the negative control, without anti-HIF2α

primary antibody.

Supplementary Tables

Supplementary Table 1. Clinical characteristics of the patients with ccRCC in the

Discovery Screen and Prevalence Screen

Case ID   Patient age (years)   Sex   TNM classification*   Screen
K1        32                    M     T1N0M0                Discovery
K3        41                    M     T2N0M0                Discovery
K20       54                    M     T1N0M0                Discovery
K27       52                    F     T2N0M0                Discovery
K29       54                    M     T1N0M0                Discovery
K31       43                    M     T2N0M0                Discovery
K32       55                    F     T1N0M0                Discovery
K38       40                    F     T1N0M0                Discovery
K44       48                    F     T2N0M0                Discovery
K48       73                    F     T2N0M0                Discovery
K6        40                    M     T1N0M0                Prevalence
K36       58                    F     T1NxMx                Prevalence
K41       60                    M     T2N0M0                Prevalence
K43       54                    M     T2N0M0                Prevalence
K50       35                    M     T1N0M0                Prevalence
K51       40                    M     T1N0M0                Prevalence
K53       68                    F     T2N0M0                Prevalence
K55       52                    M     T4N0M0                Prevalence
K56       51                    M     T1N0M0                Prevalence
K66       56                    F     T1N0M0                Prevalence
K67       77                    M     T1N0M0                Prevalence
K69       54                    F     T1bN0M0               Prevalence
K73       42                    F     T2N0M0                Prevalence
K75       68                    M     T1N0M0                Prevalence
K76       60                    M     T1N0M0                Prevalence
K82       46                    M     T1NxM1                Prevalence
K83       74                    M     T1N0M0                Prevalence
K87       73                    M     T1aN0M0               Prevalence
K90       60                    F     T1aN0M0               Prevalence
K101      71                    M     T1N0M0                Prevalence
K103      75                    M     T2N0M0                Prevalence
K104      57                    M     T1N0M0                Prevalence
K105      38                    F     T2N0M0                Prevalence
K107      52                    M     T1aN0M0               Prevalence
K108      76                    M     T1bN0M0               Prevalence
K112      44                    F     T1N0M0                Prevalence
K116      38                    F     T1N0M0                Prevalence
K117      59                    M     T1N0M0                Prevalence
K118      58                    M     T2bN0M0               Prevalence
K119      62                    F     T1bN0M0               Prevalence
K120      47                    F     T1aN0M0               Prevalence
K122      62                    M     T1N0M0                Prevalence
K124      50                    F     T1N0M0                Prevalence
K126      72                    M     T1aN0M0               Prevalence
K131      76                    M     T2N0M0                Prevalence
K132      53                    F     T1bN0M0               Prevalence
K133      47                    F     T2N0M0                Prevalence
K136      61                    M     T2NxMx                Prevalence
K140      74                    M     T1N0M0                Prevalence
K141      59                    M     T1N0M0                Prevalence
K143      42                    M     T2N0M0                Prevalence
K144      58                    M     T1aN0M0               Prevalence
K148      69                    F     T1N0M0                Prevalence
K149      67                    F     T1bN0M0               Prevalence
K150      62                    F     T2N0M0                Prevalence
K154      67                    F     T1N0M0                Prevalence
K172      69                    M     T1bN0M0               Prevalence
K174      30                    M     T1N0M0                Prevalence
K176      36                    M     T1N0M0                Prevalence
K185      61                    M     T3bNxMx               Prevalence
K195      54                    F     T1N0M0                Prevalence
K196      59                    F     T2N0M0                Prevalence
K198      72                    F     T1N0M0                Prevalence
K216      54                    M     T1N0M0                Prevalence
K218      63                    M     T1N0M0                Prevalence
K220      28                    M     T1N0M0                Prevalence
K232      43                    M     T1N0M0                Prevalence
K233      59                    M     T2N0M0                Prevalence
K234      59                    M     T2N0M0                Prevalence
K236      69                    F     T2N0M0                Prevalence
K245      75                    F     T1N0M0                Prevalence
K246      65                    F     T2N0M0                Prevalence
K248      58                    M     T2N0M0                Prevalence
K249      50                    F     T1N0M0                Prevalence
K251      51                    F     T1N0M0                Prevalence
K257      39                    M     T1N0M0                Prevalence
K264      42                    F     T1N0M0                Prevalence
K265      65                    M     T1N0M0                Prevalence
K277      64                    F     T4N0M0                Prevalence
K280      65                    M     T1N0M0                Prevalence
K287      70                    F     T1N0M0                Prevalence
K294      29                    F     T2N0M0                Prevalence
K295      48                    M     T1N0M0                Prevalence
K304      42                    M     T1N0M0                Prevalence
K338      58                    M     T1N0M0                Prevalence
K339      80                    M     T1N0M0                Prevalence
K340      42                    F     T1N0M0                Prevalence
K341      70                    F     T1N0M0                Prevalence
K344      44                    F     T1N0M0                Prevalence
K347      41                    F     T1N0M0                Prevalence
K349      55                    M     T2N0M0                Prevalence
K368      57                    M     T1NXMX                Prevalence
K369      47                    M     T4NxM1                Prevalence
K370      52                    M     T1N0M0                Prevalence
K39       58                    M     T2N0M0                Prevalence
K127      45                    M     T1aN0M0               Prevalence
K180      60                    M     T1aN0M0               Prevalence
K351      67                    M     T1N0M0                Prevalence

* The TNM cancer staging system was designed to gauge the extent of cancer in a patient's body. T

describes the size of the tumor and whether it has invaded nearby tissue, N describes regional lymph

nodes that are involved, and M describes distant metastasis (spread of cancer from one body part to

another). M, male; F, female.

Supplementary Table 2. Summary statistics of exome sequencing data obtained

from 10 ccRCC patients in the Discovery Screen (a) and 88 patients in the

Prevalence Screen (b). This table lists the total number of sequencing reads obtained

from each of the two patient-matched samples, along with the number of reads that map

uniquely to the human reference genome (hg18), the number of reads that overlap with

the targeted regions and the number of reads left after filtering out duplicated reads with

the same start and end positions. The fold-coverage of targets and the fraction of coverage of

targets are also shown. N, normal control; T, tumor.

(a)

Case ID  Sample  Total        Uniquely mapping      Overlapping target   Non-duplicated  Fold coverage  % of targets covered
K1       T       127,306,388  119,831,649 (94.13%)  56,944,105 (44.73%)  18,390,879      100.76         98.14%
K1       N       135,657,230  125,260,259 (92.34%)  76,326,538 (56.26%)  33,470,080      153.13         99.80%
K3       T       102,125,495  94,924,056 (92.95%)   58,606,272 (57.39%)  25,733,418      120.93         99.74%
K3       N       105,117,891  97,287,816 (92.55%)   61,806,267 (58.80%)  25,665,404      129.59         99.78%
K20      T       134,709,613  125,788,179 (93.38%)  72,433,225 (53.77%)  28,812,064      150.72         99.79%
K20      N       86,763,915   81,754,870 (94.23%)   35,423,568 (40.83%)  17,908,092      60.65          98.49%
K27      T       95,491,763   88,999,007 (93.20%)   60,532,091 (63.39%)  23,605,720      127.79         99.52%
K27      N       101,904,145  94,403,535 (92.64%)   59,657,922 (58.54%)  25,872,762      124.95         99.58%
K29      T       112,445,886  104,109,447 (92.59%)  64,786,268 (57.62%)  26,499,867      136.87         99.75%
K29      N       106,208,625  98,714,396 (92.94%)   60,216,090 (56.70%)  25,831,956      122.83         99.67%
K31      T       103,430,009  96,555,641 (93.35%)   59,750,125 (57.77%)  26,951,886      121.97         99.64%
K31      N       105,910,498  98,570,394 (93.07%)   60,773,847 (57.38%)  25,367,802      128.50         99.73%
K32      T       104,104,021  97,120,170 (93.29%)   63,402,594 (60.90%)  28,058,743      131.29         99.59%
K32      N       130,051,030  119,694,439 (92.04%)  71,943,004 (55.32%)  31,169,910      146.26         99.67%
K38      T       132,622,805  124,360,771 (93.77%)  64,237,324 (48.44%)  24,143,979      134.46         99.56%
K38      N       107,534,761  99,112,889 (92.17%)   66,190,237 (61.55%)  23,989,854      141.99         99.60%
K44      T       110,050,614  102,214,359 (92.88%)  67,728,952 (61.54%)  29,998,135      140.02         99.63%
K44      N       130,269,523  121,117,993 (92.97%)  66,095,379 (50.74%)  24,197,787      136.51         99.66%
K48      T       136,710,995  118,120,121 (86.40%)  53,220,240 (38.93%)  26,040,647      105.03         99.36%
K48      N       137,927,577  128,077,052 (92.86%)  63,326,424 (45.91%)  24,175,615      132.72         99.66%


(b)

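Case ID  Sample  Total  Uniquely mapping (reads, % of total)  Overlapping target (reads, % of total)  Non-duplicated  Fold coverage  % of targets covered

Each case occupies two rows, tumor (T) then normal (N); the case ID is printed on the N row.

T 10,919,571 10,557,180 96.68% 6,473,033 59.28% 1,848,878 100.07 99.12%

K101 N 14,912,260 14,397,586 96.55% 9,788,798 65.64% 1,402,216 156.65 99.30%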

T 9,816,682 9,513,486 96.91% 5,895,963 60.06% 1,784,950 90.77 99.14%

K103 N 10,597,195 10,276,461 96.97% 6,579,892 62.09% 1,822,058 100.87 99.17%

T 15,822,688 15,347,250 97.00% 11,271,463 71.24% 1,038,854 182.74 99.29%

K104 N 17,196,178 16,719,283 97.23% 11,068,816 64.37% 1,645,057 170.26 99.07%

T 10,355,462 9,990,495 96.48% 6,049,759 58.42% 1,856,398 93.95 99.22%

K105 N 17,683,946 17,177,403 97.14% 11,545,642 65.29% 1,627,872 178.09 99.16%

T 12,005,461 11,613,184 96.73% 7,459,552 62.13% 1,826,597 115.27 99.19%

K107 N 18,366,279 17,841,186 97.14% 12,074,472 65.74% 1,615,179 186.56 99.18%

T 9,771,638 9,451,910 96.73% 6,247,436 63.93% 1,821,394 98.27 99.32%

K108 N 17,780,223 17,283,657 97.21% 11,428,760 64.28% 1,650,706 177.37 99.25%

T 15,748,920 15,251,802 96.84% 10,444,123 66.32% 536,427 167.23 99.49%

K112 N 12,572,361 12,084,370 96.12% 7,681,349 61.10% 864,725 123.13 99.38%

T 9,748,924 9,395,937 96.38% 5,523,011 56.65% 1,736,882 86 99.23%

K116 N 16,144,983 15,658,198 96.98% 10,705,625 66.31% 1,550,007 170.03 99.26%

T 10,324,139 10,026,712 97.12% 6,850,772 66.36% 1,773,767 107.63 99.31%

K117 N 19,971,275 19,363,008 96.95% 13,492,022 67.56% 1,374,341 215.32 99.38%

T 9,645,942 9,352,249 96.96% 6,064,241 62.87% 1,804,432 94.34 99.12%

K118 N 15,478,449 15,053,608 97.26% 10,085,713 65.16% 1,675,195 155.36 99.22%

T 10,956,140 10,555,972 96.35% 6,255,214 57.09% 1,776,929 98 99.27%

K119 N 9,575,376 9,274,966 96.86% 5,982,129 62.47% 1,828,452 93.1 99.17%

T 16,068,739 15,585,295 96.99% 11,857,826 73.79% 923,002 193.12 99.33%

K120 N 8,753,592 8,464,276 96.69% 5,559,044 63.51% 1,807,828 87.14 99.25%

T 27,620,221 26,762,656 96.90% 17,878,114 64.73% 599,527 285.95 99.60%

K122 N 13,409,926 12,906,815 96.25% 8,154,240 60.81% 913,798 129.55 99.50%

T 9,776,475 9,489,259 97.06% 6,246,220 63.89% 1,777,353 97.69 99.27%

K124 N 18,449,116 17,913,092 97.09% 12,403,031 67.23% 1,592,099 194.4 99.35%

T 15,860,229 15,343,571 96.74% 9,838,867 62.03% 722,034 156.12 99.47%

K126 N 9,083,419 8,717,822 95.98% 5,567,588 61.29% 738,120 89.74 98.88%

T 17,472,816 16,973,040 97.14% 11,860,786 67.88% 1,517,280 184.34 99.30%

K127 N 10,152,542 9,738,100 95.92% 6,146,567 60.54% 784,398 98.05 99.31%

T 8,436,646 8,198,559 97.18% 5,459,983 64.72% 1,707,565 86.42 99.47%

K131 N 10,177,477 9,899,078 97.26% 6,937,499 68.17% 1,872,616 110.5 99.47%

T 10,512,256 10,207,000 97.10% 6,696,475 63.70% 1,799,973 103.93 99.20%

K132 N 17,967,508 17,440,880 97.07% 12,191,855 67.86% 1,535,197 193.54 99.35%

T 10,994,050 10,712,055 97.44% 7,057,949 64.20% 1,858,896 111.41 99.49%

K133 N 12,277,367 11,946,244 97.30% 8,296,668 67.58% 1,773,623 132.4 99.50%

T 9,602,594 9,307,417 96.93% 5,979,156 62.27% 1,819,864 92.92 99.19%

K136 N 14,092,856 13,686,519 97.12% 8,802,302 62.46% 1,716,965 137.82 99.37%

T 12,845,368 12,417,905 96.67% 8,020,096 62.44% 676,119 127.87 99.45%

K140 N 11,711,606 11,261,233 96.15% 7,304,148 62.37% 810,441 117.73 99.32%

T 9,947,236 9,618,755 96.70% 6,182,583 62.15% 1,855,509 96.97 99.30%

K141 N 9,481,866 9,181,204 96.83% 5,790,693 61.07% 1,798,322 88.54 98.78%


T 9,635,349 9,338,798 96.92% 6,060,162 62.90% 1,808,127 94.72 99.24%

K143 N 9,788,459 9,480,858 96.86% 6,061,274 61.92% 1,806,110 92.77 98.85%

T 8,850,777 8,584,873 97.00% 5,305,251 59.94% 1,732,589 82.04 99.16%

K144 N 9,911,179 9,596,487 96.82% 6,106,835 61.62% 1,834,012 93.78 98.91%

T 15,784,853 15,263,844 96.70% 9,846,446 62.38% 726,890 158.31 98.99%

K148 N 16,279,138 15,646,371 96.11% 10,016,200 61.53% 852,865 162.18 99.08%

T 10,078,211 9,714,738 96.39% 5,674,321 56.30% 1,698,288 87.74 99.24%

K149 N 9,051,653 8,759,445 96.77% 5,538,014 61.18% 1,793,888 85.01 98.88%

T 10,098,849 9,770,291 96.75% 6,100,682 60.41% 1,808,291 93.87 99.23%

K150 N 10,981,147 10,624,385 96.75% 6,948,165 63.27% 1,827,951 106.76 99.07%

T 8,520,850 8,236,656 96.66% 4,663,070 54.73% 1,731,506 71.55 98.92%

K154 N 9,884,408 9,564,039 96.76% 6,208,379 62.81% 1,780,290 95.56 99.03%

T 16,829,343 16,320,993 96.98% 12,264,601 72.88% 933,674 200.2 99.41%

K172 N 14,701,132 14,115,215 96.01% 9,063,389 61.65% 855,328 145.79 99.39%

T 10,204,811 9,892,736 96.94% 6,790,253 66.54% 1,663,813 108.7 99.49%

K174 N 12,659,131 12,312,128 97.26% 8,650,076 68.33% 1,788,956 137.71 99.50%

T 23,825,128 23,021,707 96.63% 14,767,573 61.98% 730,306 234.97 99.52%

K176 N 13,602,257 13,051,521 95.95% 8,424,245 61.93% 803,247 135.83 99.21%

T 12,548,567 12,166,039 96.95% 8,341,166 66.47% 494,760 133.82 99.24%

K180 N 9,362,197 9,038,185 96.54% 5,938,966 63.44% 800,995 94.8 99.29%

T 14,164,256 13,703,826 96.75% 8,773,860 61.94% 706,690 139.88 99.49%

K185 N 24,307,791 23,455,499 96.49% 15,506,996 63.79% 961,123 248.13 99.40%

T 9,180,831 8,862,285 96.53% 5,440,539 59.26% 1,824,114 84.71 99.17%

K195 N 9,638,960 9,324,387 96.74% 5,874,065 60.94% 1,812,394 90.03 99.05%

T 15,708,609 15,182,307 96.65% 9,874,638 62.86% 697,643 158.15 99.46%

K196 N 6,444,676 6,215,269 96.44% 4,075,500 63.24% 672,722 65.16 98.99%

T 8,644,011 8,358,374 96.70% 5,112,705 59.15% 1,706,599 78.82 99.06%

K198 N 9,496,091 9,184,746 96.72% 5,951,324 62.67% 1,832,359 91.68 99.10%

T 13,662,091 13,224,257 96.80% 8,646,944 63.29% 1,804,384 134.54 99.34%

K216 N 8,668,846 8,370,242 96.56% 5,643,303 65.10% 1,494,319 90.1 99.22%

T 16,431,013 15,886,708 96.69% 10,086,283 61.39% 740,755 161.43 99.35%

K218 N 9,495,406 9,146,026 96.32% 5,917,115 62.32% 799,019 94.6 99.29%

T 11,114,482 10,771,667 96.92% 7,383,768 66.43% 1,799,057 115.72 99.33%

K220 N 10,376,136 10,027,355 96.64% 6,534,058 62.97% 1,848,346 101.74 99.16%

T 10,271,254 9,883,539 96.23% 5,821,939 56.68% 1,735,805 91.53 99.32%

K232 N 17,605,478 17,064,865 96.93% 10,321,235 58.63% 1,572,728 158.1 98.85%

T 7,730,051 7,490,895 96.91% 5,043,223 65.24% 490,983 80.98 97.70%

K233 N 7,782,244 7,510,327 96.51% 4,876,725 62.66% 761,153 77.66 99.29%

T 7,646,685 7,348,479 96.10% 4,471,348 58.47% 738,487 70.95 99.27%

K234 N 10,538,608 10,157,112 96.38% 6,609,103 62.71% 805,254 105.43 99.31%

T 13,220,869 12,778,259 96.65% 8,157,749 61.70% 729,217 129.71 99.47%

K236 N 14,783,968 14,257,711 96.44% 9,284,380 62.80% 896,953 148.69 99.33%

T 10,381,285 10,069,659 97.00% 6,706,407 64.60% 1,766,847 104.11 99.21%

K245 N 18,230,162 17,664,680 96.90% 11,052,481 60.63% 1,561,297 169.79 98.98%

T 8,982,667 8,691,769 96.76% 5,639,386 62.78% 1,782,810 88.29 99.19%

K246 N 20,510,587 19,884,940 96.95% 12,611,716 61.49% 1,499,950 194.05 99.09%

T 9,213,998 8,971,417 97.37% 6,104,920 66.26% 1,726,366 97.68 99.46%

K248 N 9,655,962 9,369,858 97.04% 6,535,664 67.69% 1,610,515 104.58 99.49%

T 10,627,853 10,297,298 96.89% 6,751,240 63.52% 467,894 107.65 99.45%

K249 N 12,531,641 12,080,228 96.40% 7,735,812 61.73% 886,576 123.33 99.35%


T 11,217,775 10,885,388 97.04% 7,400,668 65.97% 1,733,803 117.86 99.50%

K251 N 14,873,410 14,477,964 97.34% 10,331,831 69.47% 1,773,695 163.87 99.52%

T 10,998,417 10,618,741 96.55% 6,672,384 60.67% 1,878,149 104.21 99.23%

K257 N 19,327,179 18,756,952 97.05% 12,189,993 63.07% 1,495,421 187.36 99.10%

T 9,639,079 9,363,254 97.14% 6,169,678 64.01% 1,763,665 95.55 99.25%

K264 N 18,349,871 17,802,881 97.02% 11,222,296 61.16% 1,557,398 173.1 99.17%

T 8,904,361 8,618,156 96.79% 5,285,990 59.36% 1,803,912 81.82 99.12%

K265 N 20,493,914 19,895,707 97.08% 12,807,002 62.49% 1,469,644 197.43 99.26%

T 10,507,455 10,169,233 96.78% 6,474,005 61.61% 1,880,561 100.85 99.26%

K277 N 19,246,358 18,676,418 97.04% 12,049,808 62.61% 1,526,404 185.87 99.33%

T 9,981,944 9,706,200 97.24% 6,334,589 63.46% 1,788,725 97.62 99.14%

K280 N 20,135,377 19,544,103 97.06% 12,679,189 62.97% 1,485,629 197.36 99.33%

T 9,293,049 8,980,677 96.64% 5,632,038 60.60% 1,832,277 87.53 99.21%

K287 N 20,001,883 19,392,661 96.95% 12,752,066 63.75% 1,503,805 198.83 99.32%

T 11,761,729 11,455,784 97.40% 7,859,129 66.82% 1,934,149 122.23 99.47%

K294 N 10,161,803 9,868,217 97.11% 6,818,266 67.10% 1,683,252 108.78 99.50%

T 9,443,206 9,123,267 96.61% 5,498,420 58.23% 1,845,052 85.26 99.11%

K295 N 9,667,437 9,343,975 96.65% 5,946,975 61.52% 1,875,319 92.99 99.27%

T 10,585,172 10,226,008 96.61% 6,388,425 60.35% 1,819,078 99.34 99.25%

K304 N 8,431,454 8,155,693 96.73% 4,989,769 59.18% 1,774,193 76.6 99.15%

T 10,202,292 9,842,082 96.47% 5,820,362 57.05% 1,728,628 90.36 99.27%

K338 N 15,789,371 15,297,934 96.89% 9,591,005 60.74% 1,650,372 147.69 98.92%

T 9,856,187 9,546,431 96.86% 6,356,799 64.50% 430,854 101.98 98.58%

K339 N 12,164,060 11,738,048 96.50% 7,673,165 63.08% 882,582 122.39 99.34%

T 12,256,739 11,855,449 96.73% 8,089,925 66.00% 468,583 130.51 98.76%

K340 N 17,084,841 16,465,061 96.37% 10,776,097 63.07% 904,539 172.01 99.39%

T 10,717,326 10,334,940 96.43% 6,176,442 57.63% 1,745,979 95.93 99.29%

K341 N 18,585,117 17,987,593 96.78% 11,537,517 62.08% 1,611,225 178.09 99.01%

T 16,520,931 15,992,407 96.80% 10,778,663 65.24% 542,308 172.33 99.52%

K344 N 18,577,199 17,986,080 96.82% 11,647,102 62.70% 1,616,381 180.74 98.99%

T 9,495,375 9,159,697 96.46% 5,652,300 59.53% 1,796,570 88.47 99.17%

K347 N 9,955,296 9,622,538 96.66% 6,024,518 60.52% 1,886,979 94.27 99.30%

T 10,288,326 9,931,718 96.53% 6,114,954 59.44% 1,831,323 95.22 99.21%

K349 N 9,194,115 8,884,934 96.64% 5,568,054 60.56% 1,832,534 85.76 99.19%

T 13,822,438 13,297,059 96.20% 8,297,354 60.03% 883,586 132.45 99.33%

K351 N 17,643,287 17,106,716 96.96% 10,791,487 61.16% 1,622,653 166.4 99.03%

T 12,365,777 11,965,665 96.76% 8,227,574 66.54% 461,564 132.38 98.22%

K368 N 18,111,658 17,533,998 96.81% 11,439,399 63.16% 1,613,905 177.86 99.11%

T 17,490,983 16,973,172 97.04% 12,898,056 73.74% 939,502 210.51 99.30%

K369 N 8,866,291 8,586,536 96.84% 5,315,254 59.95% 1,801,458 81.9 99.22%

T 7,390,749 7,170,142 97.02% 4,647,134 62.88% 1,723,683 72.8 99.11%

K36 N 14,617,778 14,208,377 97.20% 10,378,484 71.00% 1,054,749 167.55 99.26%

T 13,006,142 12,593,261 96.83% 8,642,496 66.45% 487,180 139.18 99.06%

K370 N 13,539,814 13,111,876 96.84% 7,816,959 57.73% 1,672,131 121.46 99.05%

T 8,233,116 7,962,359 96.71% 5,072,351 61.61% 1,784,143 78.94 99.13%

K39 N 8,166,961 7,893,790 96.66% 4,963,947 60.78% 1,796,396 77.28 99.09%

T 9,803,154 9,455,381 96.45% 5,525,535 56.36% 1,733,671 85.8 99.28%

K41 N 16,668,324 16,213,835 97.27% 12,004,383 72.02% 993,323 193.61 99.36%

T 9,988,137 9,706,138 97.18% 6,321,642 63.29% 1,770,259 97.47 99.17%

K43 N 14,985,494 14,567,600 97.21% 10,769,580 71.87% 1,111,481 174 99.26%


T 9,962,824 9,686,338 97.22% 6,112,702 61.36% 1,766,966 94.1 99.09%

K50 N 15,834,533 15,417,462 97.37% 10,267,680 64.84% 1,594,207 160.57 99.40%

T 6,718,083 6,517,045 97.01% 3,983,242 59.29% 1,640,059 61.98 99.05%

K51 N 15,627,549 15,141,165 96.89% 9,657,955 61.80% 1,707,179 148.21 99.03%

T 16,224,986 15,747,907 97.06% 11,455,959 70.61% 1,067,752 184.91 99.31%

K53 N 18,475,330 17,881,765 96.79% 11,999,076 64.95% 1,629,926 186.98 99.21%

T 5,942,354 5,770,622 97.11% 4,357,154 73.32% 1,039,578 70.93 99.12%

K55 N 16,757,943 16,237,421 96.89% 10,419,428 62.18% 1,713,203 160.2 99.18%

T 10,651,031 10,343,062 97.11% 6,848,858 64.30% 1,769,455 105.62 99.24%

K56 N 16,018,304 15,509,177 96.82% 9,943,809 62.08% 1,721,791 152.91 99.18%

T 9,142,863 8,855,604 96.86% 5,534,865 60.54% 1,748,376 85.76 99.09%

K66 N 11,256,972 10,839,273 96.29% 6,246,907 55.49% 1,729,906 97.88 99.16%

T 8,675,677 8,402,823 96.85% 5,214,773 60.11% 1,734,116 80.19 99.17%

K67 N 9,354,050 9,065,434 96.91% 5,666,373 60.58% 1,794,174 88.12 99.15%

T 11,641,030 11,301,492 97.08% 7,726,105 66.37% 1,888,019 122.28 99.49%

K69 N 14,065,363 13,660,391 97.12% 9,365,826 66.59% 1,888,406 146.26 99.49%

T 9,172,818 8,878,267 96.79% 5,616,501 61.23% 1,781,015 86.38 99.23%

K6 N 15,210,562 14,753,615 97.00% 10,521,033 69.17% 954,771 170.2 98.96%

T 16,692,429 16,206,593 97.09% 11,995,595 71.86% 1,014,287 193.78 99.36%

K73 N 9,005,165 8,728,798 96.93% 5,384,643 59.80% 1,786,885 82.5 99.10%

T 10,790,182 10,408,554 96.46% 6,156,646 57.06% 1,713,488 95.21 99.24%

K75 N 17,004,653 16,448,111 96.73% 11,033,978 64.89% 1,638,416 173.58 99.25%

T 10,859,990 10,479,702 96.50% 6,373,278 58.69% 1,776,740 98.67 99.30%

K76 N 13,530,046 13,067,781 96.58% 8,985,263 66.41% 1,399,737 144.38 99.18%

T 8,170,428 7,941,906 97.20% 5,970,418 73.07% 939,249 97.07 98.73%

K82 N 11,763,854 11,378,380 96.72% 7,124,858 60.57% 1,838,587 110.73 99.27%

T 9,071,361 8,760,455 96.57% 5,545,245 61.13% 628,871 88.42 99.38%

K83 N 17,901,149 17,297,770 96.63% 11,597,055 64.78% 1,505,935 184.97 99.26%

T 10,435,161 10,085,937 96.65% 6,214,964 59.56% 1,820,136 96.22 99.19%

K87 N 16,463,955 15,932,932 96.77% 10,473,023 63.61% 1,705,283 163.19 99.28%

T 14,867,391 14,386,277 96.76% 9,197,343 61.86% 724,905 146.22 99.52%

K90 N 17,010,646 16,459,555 96.76% 10,742,370 63.15% 1,661,058 168.75 99.29%
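The two percentage columns in Supplementary Table 2 can be reproduced from the read counts. The sketch below assumes, based on the tabulated numbers rather than on any statement in the caption, that both percentages are taken relative to the total read count; the helper name `percent_of_total` is hypothetical.

```python
def percent_of_total(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100.0 * count / total, 2)


# Read counts for the K101 tumor sample, copied from Supplementary Table 2(b).
total_reads = 10_919_571
uniquely_mapping = 10_557_180
overlapping_target = 6_473_033

print(percent_of_total(uniquely_mapping, total_reads))    # 96.68
print(percent_of_total(overlapping_target, total_reads))  # 59.28
```

Both values match the percentages listed for K101 T, which supports reading the "overlapping target" percentage as a fraction of all reads rather than of uniquely mapped reads.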


Supplementary Table 3. Details of the predicted somatic mutations detected in the Discovery Screen (see the Excel spreadsheet).

Supplementary Table 4. A list of genes selected for targeted exon re-sequencing in the Prevalence Screen (see the Excel spreadsheet).

Supplementary Table 5. A list of confirmed somatic mutations detected in the 98 ccRCCs (see the Excel spreadsheet).


Supplementary Table 6. Significance of the observed mutation rate over the expected mutation rate.

Gene symbol  Non-silent somatic changes (Missense; Nonsense/splice site/indel)  Total non-silent mutations  Patients harboring non-silent mutations  P value (passenger probability)

VHL 6 21 27 27 1.56E-71
PBRM1 2 19 21 20 2.83E-31
JARID1C 2 7 9 9 9.76E-11
BAP1 2 7 9 8 1.45E-15
LRP1B 8 0 8 7 7.63E-09
TP53 3 3 6 6 3.34E-11
SYNE2 5 1 6 6 1.07E-04
CSMD3 6 1 7 5 5.02E-08
AKAP13 5 0 5 5 3.80E-05
SPTBN4 4 0 4 4 5.02E-03
SETD2 0 4 4 4 5.03E-03
RYR1 4 0 4 4 2.29E-02
NAV3 4 0 4 4 1.90E-04
CARD11 5 0 5 4 1.18E-05
AHNAK 7 2 9 4 9.29E-09
ZNF804A 2 1 3 3 1.63E-02
TSC1 0 3 3 3 1.34E-02
SHANK1 2 1 3 3 9.20E-03
MLL3 2 1 3 3 9.28E-02
LRRK2 3 0 3 3 4.28E-04
FMN2 3 0 3 3 4.26E-03
FAM111B 1 2 3 3 8.25E-03
CUL7 2 1 3 3 3.66E-02
ASB15 1 2 3 3 1.27E-04
ZNF800 2 0 2 2 5.31E-03
ZNF16 2 0 2 2 3.80E-03
UBR5 0 2 2 2 1.02E-01
TTN 2 0 2 2 2.79E-01
TRPC6 2 0 2 2 6.44E-03
TRIP11 2 0 2 2 3.66E-02
TNKS 1 1 2 2 2.06E-02
SMC6 0 2 2 2 1.85E-01
SFTPD 2 0 2 2 8.56E-03
ROS1 2 0 2 2 1.93E-01
RIF1 0 2 2 2 7.66E-02
RET 1 1 2 2 8.12E-02
PZP 1 1 2 2 4.83E-02
PRDM16 1 1 2 2 8.95E-02
PPFIA1 2 0 2 2 6.44E-02
PCM1 1 1 2 2 3.12E-01
MUC7 2 0 2 2 2.00E-03
MLLT4 2 0 2 2 9.46E-02
MLL2 1 1 2 2 3.91E-01
MAP3K1 2 0 2 2 2.90E-02
LGR5 2 0 2 2 3.08E-02
KNTC1 2 0 2 2 8.22E-02
KIAA0564 2 0 2 2 1.62E-02
KDR 2 0 2 2 3.17E-02
KCNA3 2 0 2 2 1.38E-02
ITPR2 2 0 2 2 1.05E-01
HERC2 2 0 2 2 3.69E-01
HERC1 2 0 2 2 2.35E-01
HECTD2 1 1 2 2 7.19E-03
GAS7 2 0 2 2 9.07E-03
FRAP1 2 0 2 2 1.30E-01
EPS15 3 0 3 2 7.28E-04
DOCK5 2 0 2 2 4.58E-02
DNAH8 2 0 2 2 2.37E-01
DIS3 2 0 2 2 3.07E-02
DENND1A 1 1 2 2 1.92E-01
DCP1B 1 1 2 2 8.13E-03
COG4 2 0 2 2 2.57E-02
CLYBL 2 0 2 2 9.34E-03
CHTF18 2 0 2 2 2.22E-02
CEP97 2 0 2 2 3.58E-02
CCDC60 2 0 2 2 3.79E-02
CAD 2 0 2 2 1.66E-01
C10orf137 2 0 2 2 3.92E-02
BTRC 2 0 2 2 1.07E-02
ATRNL1 2 0 2 2 3.17E-02
APC 2 0 2 2 1.57E-01
AGXT 2 0 2 2 1.37E-02
ADCY9 2 0 2 2 5.33E-02
ZNF672 1 0 1 1 3.57E-01
ZNF521 1 0 1 1 3.37E-01
ZNF285A 1 0 1 1 3.41E-01
ZFHX3 0 1 1 1 9.62E-01
ZC3H14 1 0 1 1 3.33E-01
WWP2 1 0 1 1 3.93E-01
WDR21A 1 0 1 1 3.95E-01
USP6 1 0 1 1 3.26E-01
USP21 1 0 1 1 1.87E-01
UBQLN1 1 0 1 1 2.29E-01
UBE4B 1 0 1 1 5.80E-01
UBE3B 1 0 1 1 3.88E-01
UBE2Q2 1 0 1 1 1.70E-01
UBAC1 0 1 1 1 1.51E-03
UBA1 1 0 1 1 5.51E-01
TRIP12 0 1 1 1 4.11E-01
TRIM24 1 0 1 1 3.45E-01
TRAF6 1 0 1 1 2.20E-01
TPH2 1 0 1 1 2.36E-01
TNPO1 0 1 1 1 2.76E-01
TMEM130 1 0 1 1 3.53E-01
THBS3 0 1 1 1 1.68E-01
TFRC 0 1 1 1 7.40E-01
TET2 0 1 1 1 4.52E-01
TECTA 0 1 1 1 6.78E-01
TBC1D2 0 1 1 1 1.57E-01
STAU1 0 1 1 1 2.30E-01
STAT6 1 0 1 1 5.17E-01
SRGAP3 1 0 1 1 2.92E-01
SOX9 0 1 1 1 6.75E-02
SNX13 1 0 1 1 4.46E-01
SLC25A13 1 0 1 1 2.31E-01
SETDB2 1 0 1 1 2.33E-01
SETD1B 0 1 1 1 8.72E-01
SCRIB 1 0 1 1 5.79E-01
SBNO1 0 1 1 1 8.57E-01
SAMD9L 1 0 1 1 5.56E-01
RUNX2 1 0 1 1 4.35E-01
RPN1 1 0 1 1 4.26E-01
RGS7 1 0 1 1 2.19E-01
RB1 1 0 1 1 3.38E-01
RAPGEF4 1 0 1 1 3.19E-01
RANBP17 1 0 1 1 4.50E-01
RAI14 0 1 1 1 1.70E-01
RAD54B 0 1 1 1 7.69E-01
PTPRZ1 1 0 1 1 4.77E-01
PTPRK 0 1 1 1 5.18E-01
PTPRJ 1 0 1 1 3.70E-01
PTPN13 1 0 1 1 4.87E-01
PTPN12 1 0 1 1 1.40E-01
PSORS1C1 0 1 1 1 2.14E-04
PRPF6 0 1 1 1 1.64E-01
PPM1G 0 1 1 1 8.40E-02
POLN 0 1 1 1 2.79E-01
POLI 0 1 1 1 7.21E-01
PLXNA4 1 0 1 1 6.15E-01
PLEKHG6 0 1 1 1 1.33E-01
PKP2 1 0 1 1 2.64E-01
PIK3CA 1 0 1 1 3.45E-01
PDE6C 1 0 1 1 2.53E-01
PDE4DIP 1 0 1 1 4.47E-01
PCK2 1 0 1 1 3.28E-01
PCDHAC1 0 1 1 1 4.16E-01
NUP54 0 1 1 1 2.32E-03
NUP188 1 0 1 1 3.84E-01
NSD1 1 0 1 1 5.02E-01
NR3C1 1 0 1 1 2.71E-01
NOTCH1 1 0 1 1 7.08E-01
NLRP9 1 0 1 1 2.61E-01
NIN 1 0 1 1 3.86E-01
NF2 0 1 1 1 4.24E-01
NCRNA00174 0 1 1 1 2.60E-06
NCOA2 1 0 1 1 5.76E-01
NCAPD2 1 0 1 1 6.03E-01
MYT1 0 1 1 1 1.99E-01
MYST4 1 0 1 1 4.72E-01
MYO3A 0 1 1 1 4.47E-01
MYB 1 0 1 1 2.13E-01
MVK 1 0 1 1 3.30E-01
MUC15 1 0 1 1 1.40E-01
MTCH2 1 0 1 1 1.34E-01
MSH2 0 1 1 1 3.34E-01
MN1 1 0 1 1 2.24E-01
MMP3 1 0 1 1 1.96E-01
MGAT5 1 0 1 1 2.83E-01
MED30 1 0 1 1 2.04E-01
MDM2 1 0 1 1 3.06E-01
MCOLN2 1 0 1 1 3.84E-01
MAP3K7IP2 1 0 1 1 4.59E-01
LRCH1 0 1 1 1 1.16E-01
LILRB2 1 0 1 1 1.85E-01
KTN1 1 0 1 1 4.38E-01
KIAA1432 0 1 1 1 1.06E-02
KIAA1409 1 0 1 1 6.22E-01
ITPR1 1 0 1 1 5.05E-01
ITCH 0 1 1 1 4.39E-01
ISYNA1 0 1 1 1 8.24E-02
IRX6 0 1 1 1 3.34E-01
IL6ST 0 1 1 1 2.97E-01
IL27 0 1 1 1 2.79E-02
IKBKAP 1 0 1 1 2.70E-01
HYAL3 1 0 1 1 1.35E-01
HUWE1 1 0 1 1 5.27E-01
HSP90AB1 1 0 1 1 4.17E-01
HNRNPUL1 1 0 1 1 4.37E-01
HK2 1 0 1 1 2.35E-01
HERC3 1 0 1 1 5.06E-01
HEATR4 1 0 1 1 2.88E-01
GTPBP10 1 0 1 1 2.18E-01
GMPS 1 0 1 1 2.72E-01
GDI1 1 0 1 1 1.99E-01
GALNTL6 1 0 1 1 2.54E-01
FGR 1 0 1 1 3.68E-01
FANCD2 1 0 1 1 2.37E-01
FAM92B 1 0 1 1 2.72E-01
F5 1 0 1 1 6.40E-01
EPHA7 0 1 1 1 7.94E-01
ELF4 0 1 1 1 7.06E-01
DYNC1H1 1 0 1 1 7.33E-01
DYM 1 0 1 1 3.90E-01
DOCK1 0 1 1 1 4.19E-01
DDX21 1 0 1 1 4.13E-01
CUL3 1 0 1 1 2.85E-01
CUL1 1 0 1 1 4.11E-01
CREBBP 1 0 1 1 6.74E-01
CR2 0 1 1 1 3.37E-01
CPXM1 1 0 1 1 3.75E-01
COPB1 1 0 1 1 4.78E-01
CNGB3 1 0 1 1 2.69E-01
CLSPN 0 1 1 1 8.49E-01
CHD1 1 0 1 1 6.00E-01
CERKL 1 0 1 1 2.61E-01
CCBP2 1 0 1 1 1.73E-01
CBL 1 0 1 1 5.15E-01
CASP5 1 0 1 1 1.89E-01
CALML6 1 0 1 1 2.02E-01
C1QTNF9 1 0 1 1 3.17E-01
BTNL2 0 1 1 1 1.79E-03
BRIP1 1 0 1 1 3.56E-01
BRD4 0 1 1 1 3.03E-01
BRCA1 1 0 1 1 4.23E-01
BMX 1 0 1 1 2.12E-01
BLM 1 0 1 1 5.34E-01
BIRC6 1 0 1 1 7.20E-01
BIRC2 1 0 1 1 2.27E-01
B4GALNT1 0 1 1 1 6.55E-01
ATP13A2 0 1 1 1 2.05E-01
ATM 0 1 1 1 9.54E-01
ATL2 1 0 1 1 3.04E-01
ARNT 1 0 1 1 4.45E-01
ARID1A 1 0 1 1 6.70E-01
ARHGEF12 1 0 1 1 5.28E-01
ARHGEF11 1 0 1 1 6.27E-01
ARHGAP20 0 1 1 1 2.36E-01
AKAP4 1 0 1 1 4.24E-01
ADAMTS20 1 0 1 1 5.46E-01
ABTB2 1 0 1 1 2.36E-01
ABCB8 0 1 1 1 1.19E-01


Supplementary Table 7. Methylation status of the VHL promoter in tumors and matched normal control samples.

Sample ID  Number of CpGs successfully sequenced  Number of unmethylated CpGs  Number of methylated CpGs  Methylation rate of CpGs (%)  Methylation rate >10% (T or N)  log2Ratio (T/N)  P-value

The last three columns are reported once per tumor-normal pair and appear on the N row.

K31T 619 612 7 1.1

K31N 966 954 12 1.2 No - -

K41T 963 949 14 1.5

K41N 1033 1021 12 1.2 No - -

K44T 699 684 15 2.1

K44N 727 712 15 2.1 No - -

K73T 631 612 19 3

K73N 1748 1723 25 1.4 No - -

K82T 899 880 19 2.1

K82N 1825 1778 47 2.6 No - -

K83T 965 953 12 1.2

K83N 1928 1901 27 1.4 No - -

K90T 1034 1025 9 0.9

K90N 1680 1647 33 2 No - -

K112T 1004 997 7 0.7

K112N 1783 1763 20 1.1 No - -

K117T 931 920 11 1.2

K117N 1615 1586 29 1.8 No - -

K119T 895 889 6 0.7

K119N 902 882 20 2.2 No - -

K122T 892 891 1 0.1

K122N 759 740 19 2.5 No - -

K127T 966 957 9 0.9

K127N 816 802 14 1.7 No - -

K131T 895 886 9 1

K131N 464 449 15 3.2 No - -

K143T 987 972 15 1.5

K143N 1816 1799 17 0.9 No - -

K154T 1024 1011 13 1.3

K154N 727 715 12 1.7 No - -

K172T 1028 1011 17 1.7

K172N 1709 1682 27 1.6 No - -

K174T 897 887 10 1.1

K174N 820 811 9 1.1 No - -

K185T 963 956 7 0.7

K185N 1667 1642 25 1.5 No - -

K198T 1033 1023 10 1

K198N 1008 998 10 1 No - -

K218T 831 822 9 1.1

K218N 1977 1944 33 1.7 No - -

K257T 589 580 9 1.5

K257N 1790 1724 66 3.7 No - -


K347T 1036 1011 25 2.4

K347N 876 849 27 3.1 No - -

K339T 896 890 6 0.7

K339N 842 816 26 3.1 No - -

K344T 828 818 10 1.2

K344N 826 796 30 3.6 No - -

K369T 951 937 14 1.5

K369N 782 765 17 2.2 No - -

K370T 879 842 37 4.2

K370N 966 952 14 1.4 No - -

K1T 884 870 14 1.6

K1N 756 746 10 1.3 No - -

K3T 897 853 44 4.9

K3N 1799 1740 59 3.3 No - -

K6T 978 961 17 1.7

K6N 840 823 17 2 No - -

K27T 893 876 17 1.9

K27N 1017 998 19 1.9 No - -

K29T 839 825 14 1.7

K29N 1883 1805 78 4.1 No - -

K32T 810 795 15 1.9

K32N 965 960 5 0.5 No - -

K36T 833 826 7 0.8

K36N 1695 1649 46 2.7 No - -

K38T 825 807 18 2.2

K38N 964 951 13 1.3 No - -

K39T 1034 1019 15 1.5

K39N 1034 1013 21 2 No - -

K43T 942 704 238 25.3

K43N 1740 1696 44 2.5 Yes 3.339137385 < 2.2E-16

K50T 941 930 11 1.2

K50N 1033 1021 12 1.2 No - -

K51T 653 637 16 2.5

K51N 1850 1826 24 1.3 No - -

K53T 1023 1007 16 1.6

K53N 1796 1770 26 1.4 No - -

K55T 1008 987 21 2.1

K55N 1563 1537 26 1.7 No - -

K56T 726 697 29 4

K56N 1631 1605 26 1.6 No - -

K66T 1977 1953 24 1.2

K66N 2030 1995 35 1.7 No - -

K67T 878 861 17 1.9

K67N 1675 1630 45 2.7 No - -

K69T 965 945 20 2.1

K69N 1798 1761 37 2.1 No - -

K75T 773 752 21 2.7

K75N 1828 1802 26 1.4 No - -

K76T 875 853 22 2.5

K76N 1936 1884 52 2.7 No - -


K87T 836 823 13 1.6

K87N 1930 1861 69 3.6 No - -

K101T 995 976 19 1.9

K101N 1833 1755 78 4.3 No - -

K103T 951 941 10 1.1

K103N 1892 1838 54 2.9 No - -

K104T 852 836 16 1.9

K104N 986 971 15 1.5 No - -

K105T 964 959 5 0.5

K105N 1643 1624 19 1.2 No - -

K107T 895 880 15 1.7

K107N 926 911 15 1.6 No - -

K108T 748 740 8 1.1

K108N 1781 1755 26 1.5 No - -

K116T 693 690 3 0.4

K116N 1875 1857 18 1 No - -

K118T 825 821 4 0.5

K118N 1702 1670 32 1.9 No - -

K120T 919 910 9 1

K120N 978 964 14 1.4 No - -

K124T 1036 978 58 5.6

K124N 1578 1555 23 1.5 No - -

K126T 973 833 140 14.4

K126N 1767 1744 23 1.3 No - -

K133T 966 941 25 2.6

K133N 1025 1013 12 1.2 No - -

K136T 703 678 25 3.6

K136N 1022 996 26 2.5 No - -

K140T 903 791 112 12.4

K140N 804 792 12 1.5 Yes 3.047305715 < 2.2E-16

K141T 894 780 114 12.8

K141N 1780 1747 33 1.9 Yes 2.752072487 < 2.2E-16

K144T 567 552 15 2.6

K144N 753 744 9 1.2 No - -

K148T 513 498 15 2.9

K148N 626 618 8 1.3 No - -

K149T 487 478 9 1.8

K149N 960 949 11 1.1 No - -

K150T 658 640 18 2.7

K150N 674 668 6 0.9 No - -

K176T 1033 1019 14 1.4

K176N 1600 1576 24 1.5 No - -

K195T 906 880 26 2.9

K195N 1872 1850 22 1.2 No - -

K196T 1036 1022 14 1.4

K196N 967 958 9 0.9 No - -

K216T 978 699 279 28.5

K216N 1900 1861 39 2.1 Yes 3.762500686 < 2.2E-16

K220T 1034 1024 10 1

K220N 1503 1475 28 1.9 No - -


K232T 1005 993 12 1.2

K232N 1785 1767 18 1 No - -

K233T 914 911 3 0.3

K233N 701 693 8 1.1 No - -

K234T 1035 1026 9 0.9

K234N 1792 1771 21 1.2 No - -

K236T 977 559 418 42.8

K236N 1821 1779 42 2.3 Yes 4.21790503 < 2.2E-16

K245T 799 785 14 1.8

K245N 1770 1698 72 4.1 No - -

K246T 974 961 13 1.3

K246N 1770 1725 45 2.5 No - -

K248T 1008 999 9 0.9

K248N 931 897 34 3.7 No - -

K249T 1034 1018 16 1.5

K249N 1769 1723 46 2.6 No - -

K251T 1034 1018 16 1.5

K251N 1892 1842 50 2.6 No - -

K264T 872 847 25 2.9

K264N 1033 1000 33 3.2 No - -

K265T 822 808 14 1.7

K265N 1847 1785 62 3.4 No - -

K277T 917 894 23 2.5

K277N 1755 1706 49 2.8 No - -

K280T 898 875 23 2.6

K280N 823 793 30 3.6 No - -

K287T 1034 1024 10 1

K287N 1743 1709 34 2 No - -

K294T 1031 1029 2 0.2

K294N 1578 1516 62 3.9 No - -

K295T 894 737 157 17.6

K295N 828 805 23 2.8 Yes 2.652076697 < 2.2E-16

K304T 828 822 6 0.7

K304N 896 865 31 3.5 No - -

K338T 961 946 15 1.6

K338N 1936 1872 64 3.3 No - -

K340T 1003 990 13 1.3

K340N 884 860 24 2.7 No - -

K341T 971 966 5 0.5

K341N 705 684 21 3 No - -

K349T 829 817 12 1.4

K349N 882 862 20 2.3 No - -

K351T 827 820 7 0.8

K351N 827 801 26 3.1 No - -

K368T 687 681 6 0.9

K368N 1657 1625 32 1.9 No - -

N, normal control; T, tumor
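The methylation rate and log2Ratio columns of Supplementary Table 7 can be recomputed from the CpG counts. The sketch below is an illustration only: the function names are hypothetical, and the observation that the tabulated log2Ratio is computed from the rounded percentages (for K43, log2(25.3/2.5) ≈ 3.339) is inferred from the numbers in the table, not stated in it. The test behind the P-value column is not specified here, so it is not reproduced.

```python
import math


def methylation_rate(methylated: int, sequenced: int) -> float:
    """Percentage of sequenced CpGs that are methylated, to one decimal."""
    return round(100.0 * methylated / sequenced, 1)


def log2_ratio(tumor_rate: float, normal_rate: float) -> float:
    """log2 of the tumor-to-normal methylation rate ratio."""
    return math.log2(tumor_rate / normal_rate)


# CpG counts for case K43, copied from Supplementary Table 7.
tumor_rate = methylation_rate(238, 942)    # K43T
normal_rate = methylation_rate(44, 1740)   # K43N

print(tumor_rate, normal_rate)                        # 25.3 2.5
print(round(log2_ratio(tumor_rate, normal_rate), 4))  # 3.3391
```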


Supplementary Table 8. Pathways significantly enriched for somatic variations in the Discovery Screen.

KEGG pathways enriched for somatic mutations were identified with the WebGestalt bioinformatics tool. Pathways with FDR values below 0.05 are shown.

Pathway name  Number of reference genes in the category  Number of genes in the gene set and also in the category  Expected number in the category  Ratio of enrichment  P value from hypergeometric test  FDR  # of affected samples

Ubiquitin mediated proteolysis 138 6 0.7 8.61 7.67E-05 0.0013 5
Pathways in cancer 330 6 1.3 4.63 0.002 0.0164 3
Neurotrophin signaling pathway 126 4 0.49 8.09 0.0016 0.0164 4
ECM-receptor interaction 84 3 0.33 9.1 0.0045 0.0219 3
Colorectal cancer 84 3 0.33 9.1 0.0045 0.0219 2
Progesterone-mediated oocyte maturation 86 3 0.34 8.89 0.0048 0.0219 3
Dilated cardiomyopathy 92 3 0.36 8.31 0.0058 0.0238 3
GnRH signaling pathway 101 3 0.4 7.57 0.0075 0.028 2
Focal adhesion 201 4 0.79 5.07 0.0083 0.0284 3
Oocyte meiosis 114 3 0.45 6.7 0.0104 0.031 2
Regulation of actin cytoskeleton 216 4 0.85 4.72 0.0106 0.031 2
Basal cell carcinoma 55 2 0.22 9.26 0.0199 0.0442 1
Vibrio cholerae infection 56 2 0.22 9.1 0.0205 0.0442 2
Endometrial cancer 52 2 0.2 9.8 0.0179 0.0442 1
Insulin signaling pathway 137 3 0.54 5.58 0.017 0.0442 2
MAPK signaling pathway 269 5 1.36 3.68 0.0122 0.0415 4
Amyotrophic lateral sclerosis (ALS) 53 2 0.21 9.61 0.0185 0.0442 2
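The expected number, ratio of enrichment and hypergeometric P value columns of Supplementary Table 8 follow the standard over-representation calculation, which can be sketched as below. This is a generic illustration, not WebGestalt's code: the function names are hypothetical, and the reference-set size (20,000) and gene-set size (100) in the usage lines are made-up placeholders, because the table does not report them.

```python
from math import comb


def expected_hits(category_size: int, gene_set_size: int, reference_size: int) -> float:
    """Expected number of gene-set members in a category under random draws."""
    return category_size * gene_set_size / reference_size


def enrichment_ratio(observed: int, expected: float) -> float:
    """Observed count divided by the expected count."""
    return observed / expected


def hypergeom_upper_tail(observed: int, category_size: int,
                         gene_set_size: int, reference_size: int) -> float:
    """P(X >= observed) when gene_set_size genes are drawn without
    replacement from reference_size genes, category_size of which
    belong to the pathway."""
    total = comb(reference_size, gene_set_size)
    upper = min(category_size, gene_set_size)
    favourable = sum(
        comb(category_size, k) * comb(reference_size - category_size, gene_set_size - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total


# Placeholder sizes (assumptions): 20,000 reference genes, 100 mutated genes.
# Category size 138 and 6 observed hits are taken from the first table row.
exp = expected_hits(138, 100, 20_000)
print(exp, enrichment_ratio(6, exp))
print(hypergeom_upper_tail(6, 138, 100, 20_000))
```

With the true reference and gene-set sizes substituted for the placeholders, the same three functions would be expected to reproduce the corresponding columns up to rounding.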


Supplementary Table 9. Immunohistochemical analysis of HIF1α and HIF2α in tumors and matched morphologically normal renal tissues.

Case ID  HIF1α expression in tumors*  HIF1α expression in normal  HIF2α expression in tumors  HIF2α expression in normal  Genetically or epigenetically altered genes in UMPP

K1 ++ - + +

K101 + + - -

K103 ++ + ++ +

K104 - - - -

K105 + + - -

K107 NA$ NA NA NA

K108 + - + -

K112 + - ++ + HERC1, VHL

K116 ++ + + +

K117 ++ + ++ ++ VHL

K118 - - + +

K119 ++ - ++ + VHL

K120 +++ + ++ +

K122 + + + + VHL

K124 + + + +

K126 - - + + BTRC

K127 ++ - + + VHL, CUL7

K131 +++ - ++ ++ UBR5, VHL

K132 + - ++ - VHL

K133 ++ - + -

K136 +++ - ++ + UBE2Q2

K140 NA NA NA NA BIRC2,VHL#

K141 - - - - UBE4B,VHL#

K143 - - - - VHL

K144 - - + +

K148 - - - -

K149 - - - -

K150 NA NA NA NA

K154 ++ - ++ + VHL,HUWE1

K172 ++ + ++ + VHL

K174 ++ + + - VHL

K176 NA NA NA NA

K180 ++ + ++ + CUL3

K185 + - + - VHL

K195 + - ++ +

K196 - - - - BAP1,CUL1

K198 + - + + VHL,TRIP12

K20 - - - + UBE3B

K216 - - - - HERC2,VHL#

K218 ++ - + + VHL

K220 ++ - + -

K232 ++ - + +

K233 - - - -

K234 + - + -

K236 NA NA NA NA BRCA1,VHL#

K245 - - + +

K246 - - + +

K248 - - + +

K249 + - + -

K251 + - + - MAP3K1

K257 + - - - VHL,BAP1

K264 ++ - - -

K265 + - + - BAP1

K27 - - - -

K277 ++ - + - TRAF6,HERC1

K280 ++ - + -

K6 - - - -

K287 NA NA NA NA

K29 - - + +

K294 +++ + - -

K295 - - - - ITCH, VHL#

K3 + + + + UBR5,BIRC6

K304 ++ - + - BAP1

K31 + + ++ + VHL

K32 - - - -

K338 + - + - CUL7

K339 + - - - VHL

K340 + - - - BAP1

K341 + - + -

K344 - - - - VHL

K347 ++ + - - VHL,BTRC,CBL,WWP2,HERC3,UBA1

K349 - - - -

K351 + + + +

K36 - - + +

K368 + - + -

K369 ++ - ++ - VHL

K370 ++ - + - VHL

K38 + + + +

K39 ++ + + -

K41 - - - - VHL

K43 + - ++ - BAP1, VHL#

K44 - - + + BAP1,VHL

K48 + - ++ + CUL7,HERC2

K50 ++ - ++ - BAP1

K51 + + + +

K53 + + ++ +

K55 - - - -

K56 + + + +

K66 + + + +

K67 - - - - MAP3K1

K69 - - + +

K73 NA NA NA NA VHL

K75 NA NA NA NA

K76 ++ + ++ + MDM2

K82 ++ - ++ ++ VHL

K83 - - ++ + VHL

K87 + - + +

K90 ++ - ++ + VHL

* Expression levels of HIF1α or HIF2α protein were classified into four subgroups: -, no expression of HIF1α or HIF2α; +, weak expression of HIF1α or HIF2α; ++, moderate expression of HIF1α or HIF2α; and +++, intense expression of HIF1α or HIF2α.

$ NA, not available.

# The VHL promoter was hypermethylated in the tumor relative to its matched normal sample.
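For readers who want to work with this table programmatically, the following is a minimal sketch of how one row could be parsed. The numeric scores 0 to 3 assigned to the four expression grades, and the helper names `GRADE`, `parse_row`, and `hif1a_higher_in_tumor`, are assumptions introduced here for illustration; they are not part of the original analysis.

```python
# Ordinal encoding of the four expression grades from the table footnote.
# The numeric values 0-3 are an illustrative assumption, not the authors' scoring.
GRADE = {"-": 0, "+": 1, "++": 2, "+++": 3}


def parse_row(line):
    """Split one table row into (case ID, four scores, list of altered genes).

    Scores follow the column order: HIF1α tumor, HIF1α normal,
    HIF2α tumor, HIF2α normal. 'NA' entries become None.
    """
    parts = line.split(None, 5)  # the sixth field, if present, is the gene list
    case_id, grades = parts[0], parts[1:5]
    genes_field = parts[5] if len(parts) > 5 else ""
    # A trailing '#' on VHL (promoter hypermethylation) is kept as written.
    genes = [g.strip() for g in genes_field.split(",") if g.strip()]
    # '$' is only the footnote marker attached to the first 'NA'.
    scores = [GRADE.get(g.rstrip("$")) for g in grades]
    return case_id, scores, genes


def hif1a_higher_in_tumor(scores):
    """Return True/False when both HIF1α scores exist, otherwise None."""
    tumor, normal = scores[0], scores[1]
    if tumor is None or normal is None:
        return None
    return tumor > normal
```

For example, `parse_row("K131 +++ - ++ ++ UBR5, VHL")` yields the case ID `K131`, the scores `[3, 0, 2, 2]`, and the genes `["UBR5", "VHL"]`.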

