Mutations in TP53 and other heterogeneous genes help to distinguish metastases from new primary malignancies

Background: Distinguishing metastases from new primary malignancies or vice versa is important because misclassification can result in inappropriate management. However, for some cases this distinction can be challenging, particularly for squamous cell carcinomas in which the usual surgical pathology approach, predominantly morphology and immunohistochemistry, are frequently non-contributory. We analysed tumor-associated mutations in order to determine whether they could help with this diagnostic dilemma. Methods: Mutations in specific genes were identified with cBioPortal, a large publically available tumor sequence data set. Genes were selected based upon either their high overall prevalence of mutation, or their inclusion in an in-house tumor sequencing set. Tumor types analysed included various common adenocarcinomas, squamous cell carcinomas from multiple sites, urothelial carcinoma and melanoma. Individual mutations and sets of mutations within gene cohorts were compared by their diversity (or heterogeneity) index, prevalence and cumulative prevalence. We demonstrated the utility of this method by performing in-house sequencing of candidate genes in tumors from three patients for which morphology and immunohistochemistry were unable to distinguish between a metastasis and a new primary malignancy. Results: Sequence data from relatively small cohorts of candidate genes readily identified highly diverse, low prevalence mutation profiles in most common malignancies including squamous cell carcinomas. The diversity index predicted the likelihood of an identical mutation profile occurring in an unrelated tumor. High yield gene cohorts could be predicted based on the primary tumor type. Most cohorts included TP53 due to both its high mutation prevalence and high mutation index of diversity. Identical, low prevalence mutations in multiple tumors from patients in the three case studies provided strong diagnostic certainty for metastases rather than new primary malignancies. Conclusions: Most common tumors, including squamous cell carcinomas, have a readily identifiable mutation profiles that occur at a sufficiently low prevalence to effectively barcode or fingerprint the tumor. An identical mutation profile in a primary tumor and a new lesion provides strong evidence for a metastasis and effectively excludes a new primary malignancy, providing diagnostic confidence and aiding clinical management certainty.

insertions/deletions) has recently become relatively cost-effective and rapid due to advance in DNA sequencing. This has enabled comprehensive sets of cancer exomes to be generated, many of which are now freely and publically available, as well as reliable detection of these mutations in individual patient tumor samples.
Data shown here indicates that most primary malignancies have mutation profiles that are of sufficiently low frequency to "barcode" and track subsequent metastases. Surprisingly, relatively few genes are necessary to provide useful quantitative data to help distinguish a metastasis from a second primary tumor. Three cases studies are provided to illustrate this point. Readily available database sequence information was used to predict other genes that will be useful for this purpose in other typically problematic tumors including squamous cell carcinoma.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020. 11.11.20226845 doi: medRxiv preprint Microsoft Excel pivot tables were used to determine the prevalence of mutations in individual genes, and in combinations of genes. The expected prevalence of two different mutations was calculated as the product of their individual prevalence. The diversity, or heterogeneity, of sequences was quantified by Simpson's index D, where D = ) 1 ( [43] and was expressed as the index of diversity, 1-D [44]. In this equation n = the number of tumors with each sequence combination, and N = number of all tumors in the set. The index of diversity was termed the "overall heterogeneity" for the set of all tumors (i.e. including those with wild type sequences). When only tumors with mutated sequences were considered the index of diversity was termed the "mutation-specific heterogeneity".

In house next generation sequencing
For the three case studies formalin-fixed paraffin-embedded (FFPE) material was enriched for tumor cells via microdissection. DNA was extracted using either the DNeasy Blood and Tissue Kit or EZ1 DNA Tissue Kit (Qiagen) according tomanufacturer's instructions. DNA quality was assessed by an in-house designed qPCR assay based on the Ct values of 100bp and 300bp amplicons of non-cancer related genes.
Library enrichment was performed by multiplex PCR on 48x48 Access Array Microfluidics chips (Fluidigm) followed by sequencing of a 138 amplicon panel. This panel was designed inhouse to be FFPE-compatible and targets mutation hotspots in 19 genes with prognostic and/or therapeutic relevance (Table 1). Where possible, mutation hotspots were covered with multiple redundant amplicons. Data was analysed via an in-house developed pipeline. Nucleotide sequences in FASTQ format were demultiplexed with bcl2fastq Conversion Software v1.8.4 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020.11.11.20226845 doi: medRxiv preprint

Low prevalence mutation profiles are present in many tumors
The prevalence of mutations was assessed in an eight-gene subset (termed cohort-8) from the 19 genes that are frequently sequenced for prognostic and predictive data at our institution.
Mutation profiles in cohort-8 were generally useful to distinguish metastases from new primary tumors. For example, 55.1% of all tumors had a mutation profiles in cohort-8 with a prevalence of <2%, (Fig.1B), in other words occurring in less than 1 in 50 other tumors. Furthermore, 45.9% of tumors had cohort-8 mutation profile prevalences of <0.5%, i.e. occurring in <1/200 other tumors (Fig.1B). There was substantial variation between different tumor types in the proportion of tumors that had low prevalence mutation profiles. The proportion was very high for ovarian and colorectal carcinomas in which about 80% of tumors had a mutation profile . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020.11.11.20226845 doi: medRxiv preprint prevalence <2%, but was very low for prostatic and papillary thyroid carcinomas in which <15% of tumors had a mutation profile prevalence <10% (Fig.1B).
The proportion of tumors with low prevalence mutation profiles was a consequence of the overall heterogeneity of genes in cohort-8. Simpson's index of diversity [43,44] was used to quantify heterogeneity. This index expresses the probability that two randomly selected tumors will have different mutation profiles. An index of diversity of 1 represents infinite heterogeneity, and an index of 0 represents no heterogeneity. Overall heterogeneity is contributed to by both the prevalence of mutations and mutation-specific heterogeneity. A set of tumors dominated by only one or two mutation profiles will be less heterogeneous than a set composed of numerous different mutation profiles. The heterogeneity of cohort-8 was high in tumor subtypes that had abundant low prevalence mutations (compare Fig.1B and 1C). For example, heterogeneity was >98% for ovarian and colorectal carcinomas in which cohort-8 showed both highly prevalent and highly heterogeneous mutations. Tumors with only highly prevalent mutations (e.g. papillary thyroid carcinoma) or highly heterogeneous mutations (e.g. prostate) in cohort-8, but not vice versa, had a low overall tumor heterogeneity (20% for prostate carcinoma and 60% for papillary thyroid carcinoma).

TP53 heterogeneity
When all 3753 malignancies were considered together, the greatest contribution to the heterogeneity of cohort-8 was from TP53. TP53 was the most frequently mutated gene (prevalence of 35.8%) and had the most diverse set of mutations (mutation-specific heterogeneity of 0.99) resulting in the highest gene diversity (overall heterogeneity of 0.58).
Over 98% of individual TP53 mutations had prevalences of <2%, and about a third of tumors (32.9%) had a TP53 mutation with a prevalence of <1%. Mutations were widely distributed . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020.11.11.20226845 doi: medRxiv preprint throughout the TP53 sequence with some low amplitude hotspots (Fig.2C), as previously described [45]. The mean prevalence of TP53 mutations ranged from 0.25% (prostate, standard deviation of 0.06%) to 0.63% (pancreas, standard deviation of 0.44%). The commonest TP53 mutation overall was R175H, with an overall prevalence of 1.73% and the only mutation to have a prevalence of >3.5% in a specific tumor subtype (7.4% of colorectal adenocarcinomas, Fig.2C). The mutation-specific heterogeneity of KIT and PDGFRA was even greater than for TP53, 1.0 and 0.9997, respectively (Fig.1E), as all, or almost all mutations in these genes were unique. However, mutations were rare for both KIT and PDGFRA (prevalence of 3.7% and 1.6%, respectively) resulting in low overall heterogeneity (0.032 and 0.041), as this analysis includes wild-type sequences. The prevalence of BRAF mutations was relatively high at 12% (Fig.1E). However, 78% of all BRAF mutations were V600E, which resulted in a low overall heterogeneity (overall index of diversity of 0.22). AKT had the lowest overall heterogeneity (0.023) due to both low mutation prevalence (1.15%) and low mutation-specific heterogeneity (0.52). 70% of all AKT mutations were E17K.
In about a third of all tumors (34.1%) sequencing TP53 alone identified a mutation profile that would occur in <1/50 other tumors (prevalence of <2%, Fig.1D). There were also important but relatively minor contributions to the overall heterogeneity of cohort-8 from PIK3CA and KRAS, for which a mutation profile prevalence of <2% was present in 9.4% and 4.8% of all tumors, respectively (Fig.1D). A mutation profile prevalence of <2% was only present in 24.1% of tumors when TP53 was removed, compared to 55.1% when TP53 was included ( Fig.2B). TP53 by itself provided a useful low prevalence mutation profile in most tumor types ( Fig.2C). For colorectal adenocarcinomas the mutation profile prevalence of TP53 alone was similar to that for all other seven genes together (not shown). However, TP53 was less useful for endometrioid and papillary thyroid carcinomas, and melanoma (Fig.2C). This was due to . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020.11.11.20226845 doi: medRxiv preprint low prevalence rather than low mutation-specific heterogeneity (not shown). When all the malignancies were considered together, the three genes with the greatest heterogeneity (TP53, PIK3CA, KRAS, Fig.1E) together provided a mutation profile prevalence nearly equivalent to cohort-8 (Fig.2D).

Identifying other heterogeneous genes in adenocarcinomas
In order to improve mutation profiling, cBioPortal data from common tumor types was used to identify genes with frequent sequencing-detectable mutations ( Table 2). For many tumor types most of these genes have been shown to be drivers of tumorigenesis and therefore likely maintained in metastases. TP53 remained one of the most heterogeneous genes in many tumor types, with exception of endometrioid and papillary thyroid carcinoma and melanoma Where possible, a new cohort for each tumor subtype was constructed from this data (cohort-5) and compared to cohort-8. Heterogeneous driver genes with large coding sequences were frequently identified in ovarian and lung adenocarcinoma. Large size likely reduced the success of reliably obtaining their sequence. Also, for melanoma four of the top five most frequently mutated genes had no known role in neoplasia. They are likely UV-induced passenger mutations that accumulate in part due of their large size. Genes of uncertain driver status were also relatively common in lung adenocarcinoma. Therefore ovarian and lung adenocarcinomas and melanomas and were excluded from further analysis, although for serous ovarian carcinoma the predictive value of the original 8-gene cohort was already very high and unlikely to be improved.
For most adenocarcinomas low prevalence mutation profiles were more easily identified with cohort-5 compared to cohort-8 (Fig.2E). For example, a mutation profile prevalence of <2% was present in 95.1% of colorectal carcinomas with cohort-5, compared to in 80.0% of tumors with cohort-8. The exception was papillary thyroid carcinoma in which a mutation profile prevalence of <2% was present in only about 8% of tumors using either cohort-5 or cohort-8 Sequence-detected mutations were highly prevalent in papillary thyroid carcinoma (80.5%), . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020.11.11.20226845 doi: medRxiv preprint but their diversity remained low as most remained restricted to BRAF V600E (58.1%) and the new genes in cohort-5 added very little to the overall heterogeneity.

Identifying heterogeneous genes in squamous cell carcinomas
Distinguishing between poorly differentiated carcinomas with either squamous morphology or p63/p40 expression is generally more difficult than for adenocarcinomas. The differential diagnosis includes squamous cell carcinoma (SCC), urothelial carcinoma, and sometimes other carcinomas (e.g. salivary gland, metaplastic breast). It can be very difficult to separate primary versus metastatic SCC. Frequently mutated genes common to SCCs from head and neck, lung, skin and esophagus, and urothelial carcinoma were identified from cBioPortal data in order to develop a useful mutation profiling cohort. As above, genes of unproven driver status or with large coding sequences were excluded from further analysis. This resulted in a 3-gene subset (cohort-SCC) composed of TP53, CDKN2A and PIK3CA. One of these three genes was mutated in >72% of SCCs (Fig.3A). The overall diversity of cohort-SCC was high, with a heterogeneity index of >0.92 in all SCCs due to a very high mutation-specific heterogeneity (>0.99, Fig.3A). Mutation-specific heterogeneity of cohort-SCC was also very high in urothelial carcinomas (0.99), although the prevalence of mutation was lower than for SCCs (53%), resulting in a lower overall heterogeneity (index of 0.78). Again, TP53 was the most frequently mutated of the three genes in both SCC and urothelial carcinoma (Fig.3B). The mean prevalence for individual TP53 mutations ranged between 0.49% (skin SCC, standard deviation = 0.13%) to 0.67% (urothelial carcinoma, standard deviation = 0.58%, Fig.3C). The commonest TP53 mutation overall was R248W, with a prevalence of 4.0% in urothelial carcinoma (Fig.3C). Clinically useful data is generated with cohort-SCC. A mutation profile prevalence of <2% was present in at least 72% of SCCs from most sites (there were too few skin SCCs to calculate this figure) and a prevalence of <5% (i.e. occurring by chance in 1/20 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020.11.11.20226845 doi: medRxiv preprint different SCCs) was present in 72% of esophageal SCCs, >82% of SCCs from the other sites, and 53% of urothelial carcinomas (Fig.3D).

Case histories
The bioinformatics analysis above suggests mutations of sufficiently low prevalence can effectively "barcode" primary malignancies and track their metastases. The usefulness of this method is illustrated by 3 case studies. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020.11.11.20226845 doi: medRxiv preprint Analysis of cBioPortal mutation data [1,2] from 2549 adenocarcinomas relevant to males in showed that while mutation of either TP53 or KRAS was common from colorectal, lung and other sites, TP53 R282W or KRAS Q61H were relatively rare (Fig.4C). The combination of TP53 R282W with KRAS Q61H was not identified in any of adenocarcinomas including 365 colorectal adenocarcinomas. This was not surprising considered that the combination of TP53 R282W and KRAS Q61H, if independent, was predicted to present in <0.035% for any adenocarcinoma subtype (Fig.4D). The low prevalence of the TP53 R282W and KRAS Q61H combination strongly supported the diagnosis of metastatic rectal adenocarcinoma over a new primary lung adenocarcinoma.

Case 2
A female in her 30's presented with numerous liver metastases. An avid transverse colon lesion was detected by FDG-PET but was not evident on CT imaging. A diagnostic laparoscopy was converted to an open right hemicolectomy and omentectomy. Histology showed a poorly differentiated tumor invading the colonic wall (Fig.5A) which in immunostains was pan-cytokeratin+ ( Fig.5B) and CK7-CK20+ CDX2+ (not shown) consistent with a primary colonic adenocarcinoma. There was also extensive undifferentiated omental tumor that raised the possibility of a second malignancy (Fig.5C). The omental tumor was negative for numerous cytokeratins (Fig.5D), as well for CDX2 and numerous markers of melanocytic and haematolymphoid differentiation (not shown). Mismatch repair protein expression was preserved in both components. In house DNA equencing showed that the colonic adenocarcinoma and undifferentiated omental tumor both had the same point mutations in TP53 (524G>A, R175H) and BRAF (1799T>A, V600E). An incidental serrated colonic polyp from the patient was wildtype for TP53 arguing against a germline mutation.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review) preprint
The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020.11.11.20226845 doi: medRxiv preprint Analysis of cBioPortal data from 3688 adenocarcinomas relevant to females showed that the frequency of TP53 R175H and BRAF V600E were both highest in colorectum, 7.4% and 6.8% respectively (Fig.5E). TP53 R175H and BRAF V600E are predicted to occur in combination in about 0.5% of colorectal carcinomas (if independent, Fig.5F). However, this combination was not found in any of these 3688 adenocarcinomas, including 365 colorectal adenocarcinomas (1.85 tumors were expected in the colorectal set). The low prevalence of TP53 R175H with BRAF V600E combination strongly supported the omental malignancy to be metastatic undifferentiated tumor from the colonic adenocarcinoma.

Case 3
A male in his 80's, an ex-smoker, presented with a pleurally-based left lung mass. This was one year after a pT4 N1 trans-glottic poorly differentiated squamous cell carcinoma had been treated by total laryngectomy and radiotherapy. A core biopsy of the lung tumor showed a poorly differentiated non-small cell carcinoma with strong p40 and CK5/6 expression, and weak TTF-1 expression (not shown). Next generation sequencing of TP53, CDKN2A and PIK3CA showed a PIK3CA H1047R mutation (3140A>G) in both the lung and laryngeal tumors. Analysis of mutation data in cBioPortal showed that PIK3CA H1047R mutations were present in only 2.51% SCCs from the head and neck (<1 in 40 tumors) and in 1.12% from lung (about 1 in 89 tumors). PIK3CA H1047R mutations were also uncommon in other p63/p40positive tumors, present in 1.33% of esophageal SCCs, none of 29 metastatic skin SCCs, and in 0.80% urothelial carcinomas. This data provided strong support that the pleural tumor was a metastasis from the larynx and not a primary lung SCC or a metastasis from an unknown primary.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

DISCUSSION
Metastases are often readily and reliably distinguished from new primary malignancies by a combination of clinical, radiological, morphological and immunohistochemical information.
However, sometimes this is not possible, particularly for poorly differentiated tumors. The data presented shows that mutations of specific genes can be helpful to distinguish metastases from a new second malignancy. This appears to be a relatively straightforward analysis, using next generation sequencing methodology and publically available comprehensive tumor mutations data sets. It is possible that this analysis may be improved by using nucleotide rather than amino acid change, and incorporating large scale genomic changes (e.g. amplifications, deletions, translocations, inversions) or DNA modifications such as methylation. Potential limitations could results from poor tumor DNA quality (e.g. sample age, fixation and necrosis), sequencing errors, difficultly classifying variants as pathogenic, mutation-to-wild type reversions and expansion of rare sub-clones that lack truncal mutations.
The case histories illustrate specific situations in which sequence data from a small gene panel clarified conventional pathology information and guided appropriate patient management. In case 1, the solitary lung mass and mediastinal lymphadenopathy suggested primary lung adenocarcinoma. The initial biopsy site was a cytology specimen from one of the lymph nodes, and with current emphasis to limit immunohistochemistry (e.g. TTF-1, p63/p40) in order to preserve material for molecular testing for targeted therapies, this could have resulted in a diagnosis of lung adenocarcinoma. In case 2 the molecular analysis supported a diagnosis of metastatic undifferentiated tumor from the previously identified colonic adenocarcinoma. This was potentially a high stage tumor from an unknown primary, and thus improving the diagnostic certainty potentially limited unnecessary further investigation and therapy. Case 3 illustrates how molecular data can help distinguish metastatic SCC from a primary lung SCC, a . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review) preprint
The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020.11.11.20226845 doi: medRxiv preprint common dilemma for pathologists in which morphology and immunohistochemistry are typically unhelpful.
For many tumor types the gene with the highest overall index of diversity was the tumor suppressor TP53 which had both a high prevalence of mutations and a highly diverse array of mutations, as noted previously [45]. Almost all amino acids in TP53 have been found to be mutated in cancers [45]. In contrast, some genes contributed very little to tumor tracking despite being frequently mutated because the mutant subset had a low index of diversity (e.g.

BRAF V600E
). In this regard KRAS was surprisingly heterogeneous because although many KRAS mutations were codon 12/13 variants, they consisted of multiple amino acid changes occurring at similar frequencies, resulting in a surprisingly high index of diversity. TP53 mutations by themselves were sufficiently diverse that about a third of all tumors had a mutation that would only have 1% likelihood of occurring in a second malignancy. In other words identifying the same TP53 mutation in two different tumors would only occur 1 in 100 times by chance. Therefore, if a germline mutation is excluded, then such data would strongly support the two tumors to be related.
TP53 mutations typically occur early during neoplasia and persist in metastases [46,47]. Some TP53 mutations may actually promote metastasis [48]. In colorectal adenocarcinoma, mutations in genes other than TP53 are almost always detected in their metastases [11]. This included >99% of the mutations analysed in this study (not shown). Persistence of mutations in metastases is also likely to be generally true for most malignancies as most of these mutations are strong drivers of tumorigenesis.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

(which was not certified by peer review) preprint
The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020.11.11.20226845 doi: medRxiv preprint Diverse mutation profiles were difficult to identify for some tumor types including prostate adenocarcinoma and papillary thyroid carcinoma. However, in clinical practise distinguishing metastatic prostate or thyroid carcinoma from a new primary tumor is usually reliably done with conventional H&E morphology and immunohistochemistry. In contrast distinguishing between metastatic squamous cell carcinoma and a new squamous malignancy is often not possible as the morphology and immunoprofiles of SCCs are often similar. This is not an uncommon clinical dilemma because patients often have risk factors for SCC at several sites (e.g. smoking for oropharyngeal, laryngeal and lung SCC). Compared to adenocarcinomas there are no tissue-of-origin immunostains for SCC, apart from perhaps GATA3 and p16 in specific circumstances. Sequence data can be used to infer aetiology from UV, smoking or HPV, but in contrast to mutation profiling this requires a deeper bioinformatic analysis and does not distinguish between two tumors with the same aetiology.
A potential future approach based upon this study would be similar to that in the case histories, with the modification that the set of genes chosen for sequencing would be those predicted to be the most diverse in the primary malignancy. If the primary malignancy and second tumor have the same mutation profile then the prevalence of the profile in large publically available online data sets such cBioPortal will provide a quantitative estimate of how likely the second tumor is to be a metastasis. Refinement of this method may be to limit the sequencing to the most heterogeneous exons and not whole coding regions, and by avoiding GC-rich sequences.
However, the distribution of mutations throughout the TP53 suggests that the entire coding sequence of this gene will remain important to this type of analysis.

CONCLUSIONS
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020.11.11.20226845 doi: medRxiv preprint Most common tumors, including squamous cell carcinomas, have a readily identifiable set of mutations, or mutation profiles that occurs at a sufficiently low prevalence to effectively barcode or fingerprint the tumor. The same mutation profile in the primary tumor and a new lesion provides strong evidence for a metastasis and effectively excludes a new primary malignancy.

Consent for publication
The data from cases studies were part of standard of care for diagnostic, prognostic and predictive purposes. Research use of this data as shown here was approved by the Peter MacCallum Cancer Centre Ethics committee (project #03/90).

Availability of data and materials
All data not shown is available on request.

Competing interests
None.

Funding
None.
. CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020.11.11.20226845 doi: medRxiv preprint whole or part based upon data generated by the TCGA Research Network (http://cancergenome.nih.gov/). is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020.11.11.20226845 doi: medRxiv preprint prevalence, overall heterogeneity and mutation-specific heterogeneity for each gene in cohort-8.   . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020. 11.11.20226845 doi: medRxiv preprint Expected prevalence of the co-occurrence of TP53 R282W and KRAS Q61H mutations in adenocarcinomas from various sites.  . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020.11.11.20226845 doi: medRxiv preprint . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) preprint The copyright holder for this this version posted November 16, 2020. ; https://doi.org/10.1101/2020.11.11.20226845 doi: medRxiv preprint