According to WHO data from 2008, colorectal cancer caused approximately 8% of all cancer deaths worldwide. Only a particular set of genes is responsible for its occurrence. Increased or decreased expression levels of these genes cause the cells in the colorectal region to malfunction, i.e. the processes they are associated with are disrupted. This research aims to identify those genes and to build a model that determines whether a patient's tissue is carcinogenic. We propose a realistic model of the gene expression probability distribution and use it to calculate the Bayesian posterior probability for classification. We developed a new methodology aimed at obtaining the best classification results. Gene expression profiling was performed using DNA microarray technology. In this research, 24,526 genes were monitored in carcinogenic and healthy tissues equally. We also applied Support Vector Machines (SVMs) and Binary Decision Trees, which achieved highly satisfactory classification accuracy.
Keywords: DNA microarray, machine learning, colorectal cancer, Bayes' theorem, posterior probability, Support Vector Machines, Binary Decision Trees
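
As a rough illustration of classification by Bayesian posterior probability, the sketch below computes the posterior probability that a tissue sample is carcinogenic given its gene expression values. It is not the model proposed in this work: it assumes, purely for illustration, that each gene's expression is Gaussian within each class and that genes are independent given the class. The function names, the parameter layout, and the default prior of 0.5 are likewise hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log-density of a normal distribution (illustrative stand-in for the
    per-gene expression distribution; the Gaussian form is an assumption)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def posterior_carcinogenic(expression, params_cancer, params_healthy, prior_cancer=0.5):
    """Posterior P(carcinogenic | expression) via Bayes' theorem.

    expression     -- list of expression values, one per gene
    params_cancer  -- list of (mean, std) per gene for carcinogenic tissue
    params_healthy -- list of (mean, std) per gene for healthy tissue
    prior_cancer   -- prior probability of the carcinogenic class (assumed 0.5)
    """
    # Start from the log-priors, then add per-gene log-likelihoods
    # (summing assumes genes are independent given the class).
    log_c = math.log(prior_cancer)
    log_h = math.log(1.0 - prior_cancer)
    for x, (mean_c, std_c), (mean_h, std_h) in zip(expression, params_cancer, params_healthy):
        log_c += gaussian_log_pdf(x, mean_c, std_c)
        log_h += gaussian_log_pdf(x, mean_h, std_h)

    # Normalize in a numerically stable way (log-sum-exp trick).
    shift = max(log_c, log_h)
    weight_c = math.exp(log_c - shift)
    weight_h = math.exp(log_h - shift)
    return weight_c / (weight_c + weight_h)


# Toy usage with two made-up genes; a sample is labeled carcinogenic
# when the posterior exceeds 0.5.
toy_cancer = [(2.0, 1.0), (5.0, 1.5)]
toy_healthy = [(0.0, 1.0), (3.0, 1.5)]
p = posterior_carcinogenic([1.8, 4.6], toy_cancer, toy_healthy)
label = "carcinogenic" if p > 0.5 else "healthy"
```

Working in log space keeps the product of many per-gene likelihoods from underflowing, which matters when thousands of genes are involved.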