The use of genome-wide data in cancer research, for the identification of groups of patients with similar molecular characteristics, has become a standard approach for applications in therapy-response, prognosis-prediction, and drug-development. To progress in these applications, the trend is to move from single genome-wide measurements in a single cancer-type towards measuring several different molecular characteristics across multiple cancer-types. Although current approaches shed light on molecular characteristics of various cancer-types, detailed relationships between patients within cancer clusters are unclear. We propose a novel multi-omic integration approach that exploits the joint behavior of the different molecular characteristics, supports visual exploration of the data by a two-dimensional landscape, and inspection of the contribution of the different genome-wide data-types. We integrated 4,434 samples across 19 cancer-types, derived from TCGA, containing gene expression, DNA-methylation, copy-number variation and microRNA expression data. Cluster analysis revealed 18 clusters, where three clusters showed a complex collection of cancer-types, squamous-cell-carcinoma, colorectal cancers, and a novel grouping of kidney-cancers. Sixty-four samples were identified outside their tissue-of-origin cluster. Known and novel patient subgroups were detected for Acute Myeloid Leukemia’s, and breast cancers. Quantification of the contributions of the different molecular types showed that substructures are driven by specific (combinations of) molecular characteristics.
|Number of pages||13|
|Publication status||Published - 25 Apr 2016|
Bibliographical noteConcerning DOI 10.1038/s41598-018-35518:This Article contains a typographical error in the spelling of the author Wim Verhaegh, which is incorrectly given as Wim Verheagh.
Correct version has been uploaded
- Data integration
- Data mining
- Functional clustering
- OA-Fund TU Delft