Classification and analysis of TCGA Luminal Cancer datasets

Here, I analyze the TCGA datasets available for 311 patients with breast cancer classified as subtype luminal A or luminal B. The two subtypes are difficult to distinguish by histological analysis; the main distinguishing characteristic appears to be that luminal B grows slightly faster and carries a slightly worse prognosis than luminal A.


Data Preparation

Exploratory Analysis, PCA, UMAP

K-means clustering

Regularized Logistic Regression

Ensemble Classification

Expression Visualization

Written on February 2, 2021