Colorectal cancer (CRC) is one of the leading causes of cancer-related death worldwide. Approximately 10-15% of individuals with CRC have been reported to carry a causative mutation in a known susceptibility gene, which highlights the importance of identifying mutations for early detection in high-risk individuals. Extensive sequencing projects such as the International Cancer Genome Consortium (ICGC) have identified a large number of somatic point mutations. These can be used to identify cancer-associated genes, as well as the signatures of mutational processes, which are defined by the tri-nucleotide sequence context (motif) of the mutated sites. Mutation is a hallmark of the cancer genome, and many studies have reported cancer subtyping based either on which genes are frequently mutated or on the proportions of mutational processes; however, none of these subtyping methods considers both features simultaneously. This highlights the need for a more inclusive subtype classification approach to enable biomarker discovery and thus inform drug development for CRC. In this study, we developed a statistical pipeline based on a novel concept, the 'gene-motif', which merges mutated gene information with the tri-nucleotide motif of the mutated sites, to identify cancer subtypes, in this case of CRC. Our analysis identified, for the first time, 3,131 gene-motif combinations that were significantly mutated in 536 ICGC colorectal cancer samples compared with other cancer types, and resolved seven CRC subtypes with distinguishable phenotypes and biomarkers. Interestingly, we identified several genes that were mutated in multiple subtypes but with a distinct sequence context in each. Taken together, our results highlight the importance of considering both the mutation type and the mutated gene when identifying cancer subtypes and cancer biomarkers.
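To make the gene-motif concept concrete, the following is a minimal Python sketch of how a mutated gene and the tri-nucleotide motif of a mutated site might be merged into a single feature and counted per sample. It is an illustration under stated assumptions, not the pipeline used in this study: the function names, the `GENE|5'[REF>ALT]3'` label format, and the example records are all hypothetical, and the strand normalisation (reporting every substitution with a pyrimidine reference base) follows the common convention in mutational signature analysis rather than anything specified here.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def motif(context, alt):
    """Format a tri-nucleotide motif such as 'T[C>T]G'.

    `context` is the 3-base reference sequence centred on the mutated
    site and `alt` is the mutant base. Assumption: purine reference
    bases are flipped to the opposite strand so that every motif has
    C or T as its reference base.
    """
    if context[1] in "AG":
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


def gene_motif(gene, context, alt):
    """Merge a gene name and a motif into one label (hypothetical format)."""
    return f"{gene}|{motif(context, alt)}"


def count_gene_motifs(mutations):
    """Count gene-motif occurrences per sample.

    `mutations` is an iterable of (sample, gene, context, alt) tuples.
    """
    counts = {}
    for sample, gene, context, alt in mutations:
        counts.setdefault(sample, Counter())[gene_motif(gene, context, alt)] += 1
    return counts


# Made-up records for illustration only; not data from this study.
mutations = [
    ("S1", "APC", "TCG", "T"),
    ("S1", "APC", "CGA", "A"),
    ("S2", "KRAS", "ACC", "T"),
]
counts = count_gene_motifs(mutations)
```

In this toy input, the two APC records describe the same substitution seen from opposite strands, so after normalisation they fall into a single gene-motif. A sample-by-gene-motif count table of this kind is the sort of input that a downstream significance test or clustering step could take.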