Integrative analysis of mutated genes and mutational processes reveals novel mutational biomarkers in colorectal cancer
Background: Colorectal cancer (CRC) is one of the leading causes of cancer-related deaths worldwide. Recent studies have observed causative mutations in susceptibility genes related to colorectal cancer in 10 to 15% of patients. This highlights the importance of identifying mutations for early detection and more effective treatment in high-risk individuals. Mutations are a central focus of cancer research. Many studies have performed cancer subtyping based on either the frequently mutated genes or the proportions of mutational processes. However, to the best of our knowledge, these two feature types have never been combined for this task. This leaves room for better and more inclusive subtype classification approaches that use a wider range of related features to enable biomarker discovery and thus inform drug development for CRC.
Results: In this study, we develop a new pipeline for colorectal cancer subtype identification based on a novel concept called 'gene-motif', which merges mutated gene information with the tri-nucleotide motif of the mutated sites. We apply our pipeline to the International Cancer Genome Consortium (ICGC) CRC samples and identify, for the first time, 3131 gene-motif combinations that are significantly mutated in 536 ICGC colorectal cancer samples. Using these features, we identify seven CRC subtypes with distinguishable phenotypes and biomarkers, including unique cancer-related signaling pathways, for most of which targeted treatment options are currently available. Interestingly, we also identify several genes that are mutated in multiple subtypes but in distinct sequence contexts.
Conclusion: Our results highlight the importance of considering both the mutation type and the mutated genes when identifying cancer subtypes and cancer biomarkers. The new CRC subtypes presented in this study demonstrate distinct phenotypic properties that can be used to develop new treatments. By knowing the genes and phenotypes associated with each subtype, a personalized treatment plan can be developed that considers the specific phenotypes associated with a patient's genomic lesions.
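To make the 'gene-motif' concept concrete, the following is a minimal sketch of how a mutated gene and the tri-nucleotide context of a mutated site might be merged into a single feature and counted per sample. It is an illustration under stated assumptions, not the pipeline used in this study: the function names, the label format `GENE_X[R>A]Y`, the input tuple layout, and the pyrimidine-strand normalization (a convention common in mutational signature analysis) are all hypothetical choices.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def gene_motif(gene, context, alt):
    """Build a hypothetical gene-motif label such as 'KRAS_A[C>T]C'.

    context: 3-base reference sequence centred on the mutated base.
    alt:     the mutated (alternate) base.
    Assumption: substitutions are reported on the pyrimidine strand,
    so a purine reference base is reverse-complemented first.
    """
    context, alt = context.upper(), alt.upper()
    if len(context) != 3 or len(alt) != 1 or not set(context + alt) <= set("ACGT"):
        raise ValueError("expected a 3-base context and a 1-base alt over ACGT")
    if context[1] == alt:
        raise ValueError("alt base must differ from the reference base")
    if context[1] in "AG":  # purine reference: switch strands
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
    return f"{gene}_{context[0]}[{context[1]}>{alt}]{context[2]}"


def count_gene_motifs(mutations):
    """Count gene-motif features per sample.

    mutations: iterable of (sample_id, gene, context, alt) tuples
               (an assumed layout for this sketch).
    Returns a dict mapping sample_id to a Counter of gene-motif labels.
    """
    counts = {}
    for sample_id, gene, context, alt in mutations:
        label = gene_motif(gene, context, alt)
        counts.setdefault(sample_id, Counter())[label] += 1
    return counts


# Toy input with made-up sample identifiers, for illustration only.
toy_mutations = [
    ("sample_1", "KRAS", "ACC", "T"),
    ("sample_1", "TP53", "CGG", "A"),
    ("sample_2", "KRAS", "ACC", "T"),
]
per_sample = count_gene_motifs(toy_mutations)
```

In this representation, the same gene mutated in two different sequence contexts yields two separate features, which is what allows a gene shared across subtypes to be distinguished by its context. A per-sample count table of this kind could then be passed to a significance test and a clustering step, both of which are outside the scope of this sketch.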