NCSU Dept of Statistics
5109 SAS Hall
2311 Stinson Drive
Raleigh, NC 27695-8203

Tel: (919) 515-2528
Fax: (919) 515-7591

Computation for Undergraduates in Statistics Program


2010 - 2011 Projects


Survival of the Fittest: Evolution Re-applied to Genetics

Background: As genotyping technologies advance and the cost of genotyping patients in clinical studies falls while quality improves, hundreds of thousands or even millions of genetic variants per individual can now be considered in gene-mapping studies. This presents important analytical and computational challenges, because new bioinformatics tools must have reasonable computation times to be usable in such studies. The challenge is compounded when searching for complex predictive models of disease (with gene-gene interactions, etc.), because the number of candidate variant combinations grows combinatorially with the number of variants. Many of the methodologies designed for detecting complex models work well for moderately scaled studies but are computationally intractable for data from the most recent technologies.

In response to these limitations, evolutionary computation has been proposed as a way to detect complex models in this large-scale data. Evolutionary computation is a machine learning approach modeled after Darwin's concept of natural selection, or "survival of the fittest", in which computer programs evolve to find an optimal solution for a range of tasks. In human genetics, evolutionary computation has been applied to evolve decision tree models of disease risk, with promising initial success. This approach can detect complex models in relatively large data sets very efficiently.

Despite these initial successes, the implementation of the method involves many parameters whose impact on performance still needs to be investigated. Additionally, the power of the method needs to be tested across a range of simulated genetic models, and the method needs to be validated in a real data application. Freely available software for Unix/Linux will be used to perform the simulations and the data analysis with the evolutionary computation method. Students will use NCSU's supercomputing cluster, through the High Performance Computing (HPC) center, for this cutting-edge type of computation.
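
To make the "survival of the fittest" idea concrete, here is a minimal Python sketch of an evolutionary search on simulated genotype data. It is not the decision-tree software used in the project. As a simplification, each candidate solution is just a pair of variants, scored by how well a majority-vote rule over their genotype combinations predicts case/control status. The function names (simulate, fitness, evolve), the planted causal pair (3, 17), and all parameter values are hypothetical choices for illustration only.

```python
import random
from collections import defaultdict


def simulate(n_people=400, n_snps=30, causal=(3, 17), seed=0):
    """Simulate genotypes (0/1/2) and a disease status driven by an
    interaction between two hypothetical causal variants."""
    rng = random.Random(seed)
    genotypes, status = [], []
    for _ in range(n_people):
        g = [rng.choice((0, 1, 2)) for _ in range(n_snps)]
        # High risk when exactly one of the two causal variants is heterozygous.
        high_risk = (g[causal[0]] == 1) != (g[causal[1]] == 1)
        p_disease = 0.9 if high_risk else 0.1
        status.append(1 if rng.random() < p_disease else 0)
        genotypes.append(g)
    return genotypes, status


def fitness(pair, genotypes, status):
    """Training accuracy of a majority-vote rule over the genotype
    combinations of the two variants in `pair`."""
    counts = defaultdict(lambda: [0, 0])  # combination -> [controls, cases]
    for g, y in zip(genotypes, status):
        counts[(g[pair[0]], g[pair[1]])][y] += 1
    return sum(max(c) for c in counts.values()) / len(status)


def evolve(genotypes, status, pop_size=40, generations=30,
           mutation_rate=0.3, seed=1):
    """Evolve a population of variant pairs by selection, crossover and mutation."""
    rng = random.Random(seed)
    n_snps = len(genotypes[0])

    def random_pair():
        return tuple(sorted(rng.sample(range(n_snps), 2)))

    population = [random_pair() for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: the fitter half of the population survives unchanged.
        ranked = sorted(population,
                        key=lambda p: fitness(p, genotypes, status),
                        reverse=True)
        survivors = ranked[:pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            child = [rng.choice(a), rng.choice(b)]  # crossover: one variant per parent
            if rng.random() < mutation_rate:        # mutation: swap in a random variant
                child[rng.randrange(2)] = rng.randrange(n_snps)
            if child[0] == child[1]:                # repair degenerate pairs
                child = list(random_pair())
            children.append(tuple(sorted(child)))
        population = survivors + children
    return max(population, key=lambda p: fitness(p, genotypes, status))


if __name__ == "__main__":
    genotypes, status = simulate()
    best = evolve(genotypes, status)
    print("best pair:", best, "accuracy:", fitness(best, genotypes, status))
```

The population size, number of generations, and mutation rate in this sketch are examples of the kind of tuning parameters whose effect on performance the project sets out to study. On simulated data like this, the search would be expected to favor pairs containing the planted variants, although that outcome is not guaranteed for any particular random seed.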

Copyright 2011 NCSU Department of Statistics