Carlos D. Bustamante Lab
 
Grant Number: NSF0516310
Title: Computational Methods for Detecting Natural Selection using Comparative Population Genomic Data
PI: Carlos Bustamante
Support: current
Source: NSF DEB
Location: Cornell University
Duration: 09/01/05 - 08/31/09
Summary: In the near future, dozens of mammalian and Drosophila genomes will be assembled and publicly available. Coupled with large sequencing and genotyping projects already underway to document genomic variation within species, the next few years provide an unprecedented opportunity to study the forces that shape genome evolution.

The intellectual merit of this proposal is the production of broadly applicable population genetic methods for identifying genes and genomic regions that are involved in adaptive molecular evolution through the comparison of within- and between-species genomic variation. We propose to develop methods that serve four important purposes: (1) classify loci (and domains within loci) into those that evolve neutrally, those that show excess amino acid or functional non-coding variation within species, and those that show excess amino acid (or functional non-coding) differences among species; (2) partition the relative contributions of mutation bias, protein structure / domain location, and physico-chemical properties of amino acids to evolutionary exchangeability for all pairs of amino acids; (3) use the genomic distribution of Single Nucleotide Polymorphism (SNP) frequencies to differentiate between selective and demographic hypotheses for the evolutionary history of a given population; and (4) estimate the genomic distribution of selective effects on non-lethal mutations for different functional categories of mutations. For all of these tasks we will apply advances in computational statistics and numerical analysis to create powerful, robust, and broadly applicable population genetic methods. Extensive coalescent and forward simulations with selection, recombination, multiple mutations at the same nucleotide sites, and context-dependent mutation rate variation among sites will be used to test the power, robustness, and accuracy of our methods. We will also compare the power of our approaches to that of existing methods. The proposed methods will be applied to publicly available genomic data from human, chimpanzee, dog, mouse, rat, and Drosophila species to identify species-specific changes that are likely to be involved in molecular adaptation.

The broader impact of the proposed work is the production and/or refinement of two computer packages (mkprf and prfreq) for analyzing within- and between-species genomic variation data. The programs and source code will be distributed free of charge, and web servers will be developed for those wishing to use our computational resources to run analyses. We will also implement design features aimed at making our tools accessible to the broader evolutionary genomics community. Application of our methods to comparative genomic data will ultimately result in the identification of genes and genomic regions that are implicated in primate, murine, canine, and Drosophila adaptive evolution. Our hope is that these genes can be prioritized for further molecular genetic study. The project will also create opportunities for underrepresented minority students to become involved with genomics research through participation in the NSF-AGEP funded Central New York to Puerto Rico-Mayaguez (CNY-PR) Alliance for Graduate Education.

 
© Carlos D. Bustamante Lab. All Rights Reserved.