Feature Extraction Modelling And Classification

Project Title:

Advanced Feature Extraction, Modelling and Classification Method for Systems Biology

Project Description:

Systems biology is a diverse area of research that draws together a range of disciplines, such as physics, biology, computer science and engineering. Data from biological systems usually have the following properties:
1. Small samples. In conventional data-driven modelling in engineering, sensors can usually be attached to the system fairly easily to collect as many samples as needed. Unfortunately, measuring a biological system invariably involves destroying the actual system or its cells, which means that sample sizes in biological data sets are small (typically the number of samples n < 100). The problem is exacerbated in gene expression modelling, where high-throughput DNA microarrays simultaneously measure the expression of thousands of genes (the variables, p). One of the main challenges in applying these techniques to systems biology is therefore to overcome what is known as the ‘p ≫ n’ problem.
2. Correlated variables. When modelling a system or process in engineering, highly correlated variables are usually considered redundant: to obtain a compact, interpretable model, only one of the correlated variables is assigned a parameter and the others are discarded. In systems biology, however, many important variables can be highly correlated. A common biological example is gene selection from microarray data (where p ≫ n). Analysis of this type of gene expression data shows that genes share certain biological ‘pathways’ in which they are co-regulated. Genes in the same pathway exhibit high correlations and can therefore be naturally grouped together. When performing regression in these cases, it is important that the entire group of genes is added to the model, rather than each gene in isolation. Another challenge is thus to build models with grouped variables.
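As a rough illustration of what modelling with grouped variables can mean in practice, the sketch below implements group soft-thresholding, the shrinkage step used in group-lasso style regression. This technique is an assumption chosen for illustration only; the project description does not commit to any particular method. The function name, the coefficient values and the pathway assignments are all hypothetical.

```python
import math


def group_soft_threshold(coeffs, groups, threshold):
    """Shrink coefficients group by group (illustrative sketch only).

    coeffs:    list of floats, one coefficient per variable (e.g. gene)
    groups:    list of index lists; each inner list is one group (e.g. pathway)
    threshold: non-negative amount by which each group's norm is reduced
    """
    result = list(coeffs)
    for idx in groups:
        # Euclidean norm of the coefficients belonging to this group.
        norm = math.sqrt(sum(coeffs[i] ** 2 for i in idx))
        # The scale is 0 when the group norm does not exceed the threshold,
        # so all variables in the group are dropped together.
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0 else 0.0
        for i in idx:
            result[i] = scale * coeffs[i]
    return result


# Hypothetical example: five genes assigned to two pathways.
coeffs = [3.0, 4.0, 0.1, -0.2, 0.1]
pathways = [[0, 1], [2, 3, 4]]
shrunk = group_soft_threshold(coeffs, pathways, threshold=1.0)
print(shrunk)
```

In this toy example the first pathway has a large joint norm, so both of its genes stay in the model with shrunken coefficients, while the second pathway falls below the threshold and is removed as a whole. Selection therefore acts on entire groups rather than on individual correlated genes.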
To better understand numerous biological processes, which promises eventual progress in healthcare and medicine, the following tasks need to be performed:
1. Development of new feature extraction methods for systems biology.
2. Development of new group modelling methods for systems biology.
3. Development of new classification methods for systems biology.