Entry Date:
January 19, 2006

29 Mammals Project


Identifying the functional elements of the human genome, both coding and non-coding, is a key foundation for biomedical research. One of the most powerful ways to discover these elements is through cross-species comparisons with other mammalian genomes: in effect, deciphering evolution's laboratory notebook, which contains the results of 100 million years of evolution.

The mammalian genome project is an NIH-funded effort to expand the set of sequenced mammalian genomes (currently human, chimpanzee, mouse, dog, and opossum) by sequencing 24 additional mammals at low coverage (2X). The goal is to create low-coverage genome assemblies and align the resulting sequence to the human genome to permit comparative genomic analysis.

The Broad Institute is sequencing 15 of these mammals, and two other centers are sequencing the remaining 9. We are also developing algorithms to identify regions of sequence similarity across species. Such regions have persisted through evolution, which is indicative of genomic function. They include genes and smaller regulatory elements, such as transcription factor binding sites, which play key roles in determining the activation of genes and pathways in different cellular contexts.
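As a rough illustration of what identifying regions of sequence similarity can mean in practice, the sketch below scans a toy multiple alignment for windows in which every other species matches the reference (first) row. It is not the project's algorithm: the function names, the window size, the identity threshold, and the three aligned sequences are all assumptions made up for this example, and real methods model the evolutionary tree and substitution rates rather than counting matches.

```python
def column_identity(alignment):
    """Per column, the fraction of non-reference rows that match the reference row.

    `alignment` is a list of equal-length strings; row 0 is the reference.
    Columns where the reference has a gap ('-') score 0.0.
    """
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned rows must have the same length")
    ref, others = alignment[0], alignment[1:]
    scores = []
    for i, base in enumerate(ref):
        if base == "-":
            scores.append(0.0)
            continue
        matches = sum(1 for row in others if row[i] == base)
        scores.append(matches / len(others))
    return scores


def conserved_windows(alignment, window=5, threshold=0.9):
    """Return merged (start, end) half-open column ranges whose mean identity
    over a sliding window is at least `threshold`."""
    scores = column_identity(alignment)
    regions = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if mean >= threshold:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


# Made-up toy alignment (reference row first); not real genomic data.
toy_alignment = [
    "ACGTACGTTT",
    "ACGTACCAAA",
    "ACGTAGGCCC",
]
print(conserved_windows(toy_alignment, window=5, threshold=1.0))  # [(0, 5)]
```

With only three rows, a perfect match over five columns is weak evidence; the value of adding more species is that agreement across many distant lineages becomes much harder to explain by chance.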

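The mammals receiving low-coverage sequencing were chosen primarily to maximize the total branch length of the evolutionary tree: the more evolutionary time the tree spans, the more opportunity non-functional sequence has had to diverge, so conserved elements stand out more clearly. Emphasis was also placed on organisms that represent the diversity of the mammalian tree and, where possible, are biologically useful models.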
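One way to picture selection by total branch length is a greedy procedure: starting from the species already sequenced, repeatedly add the candidate whose path to the root contributes the most branch length not yet covered. The sketch below is an illustration under stated assumptions, not a description of how the project's species were actually chosen. The tree encoding (a `parent` dictionary), the function names, the leaf labels, and all branch lengths are hypothetical.

```python
def path_to_root(leaf, parent):
    """Yield each node on the path from `leaf` up to (not including) the root.

    Each yielded node stands for the branch connecting it to its parent.
    """
    node = leaf
    while node in parent:
        yield node
        node = parent[node][0]


def greedy_select(parent, already_sequenced, candidates, k):
    """Pick up to `k` candidates, each time taking the leaf that adds the most
    branch length not already covered by the chosen species.

    `parent` maps node -> (parent_node, branch_length); the root has no entry.
    """
    covered = set()
    for leaf in already_sequenced:
        covered.update(path_to_root(leaf, parent))

    def gain(leaf):
        # Branch length this leaf would add beyond what is already covered.
        return sum(parent[n][1] for n in path_to_root(leaf, parent)
                   if n not in covered)

    chosen = []
    remaining = set(candidates)
    while remaining and len(chosen) < k:
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(remaining), key=gain)
        chosen.append(best)
        covered.update(path_to_root(best, parent))
        remaining.remove(best)
    return chosen


# Hypothetical toy tree; labels and branch lengths are invented.
toy_tree = {
    "n1": ("root", 0.1), "n2": ("root", 0.2),
    "A": ("n1", 0.1), "B": ("n1", 0.1),
    "C": ("n2", 0.5), "D": ("n2", 0.05),
}
print(greedy_select(toy_tree, ["A"], ["B", "C", "D"], k=2))  # ['C', 'B']
```

In the toy tree, C is picked first because it brings in both its own long branch and the internal branch leading to it, neither of which is covered by A. A criterion based purely on branch length would then be balanced against the other considerations named above, such as usefulness as a model organism.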

Low-coverage sequence is effective for identifying features of the human genome that are shared across most mammals, but we recognize the inherent limitations of low-coverage genome analyses. We will therefore obtain higher-quality sequence data (6-7X coverage) from a limited set (8 of 24) of the mammals picked for low coverage, which will significantly aid in the annotation and understanding of the human genome.
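To give a sense of scale for the difference between 2X and 6-7X, the sketch below uses a standard idealization that is not stated in the text above: if reads land uniformly at random, the number of reads covering a given base is approximately Poisson-distributed with mean equal to the coverage, so the expected fraction of bases covered at least once is 1 - e^(-coverage). The function name is an assumption, and real assemblies deviate from this model because of repeats and sequencing biases.

```python
import math


def expected_fraction_covered(coverage):
    """Expected fraction of bases hit by at least one read, assuming reads
    fall uniformly at random (Poisson approximation with mean `coverage`)."""
    return 1.0 - math.exp(-coverage)


for depth in (2, 6, 7):
    print(f"{depth}X: {expected_fraction_covered(depth):.4f}")
```

Under this idealized model, roughly 86% of bases are expected to be covered at 2X, compared with more than 99% at 6X or 7X, which is consistent with treating the 2X assemblies as sufficient for alignment to human but incomplete on their own.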