National Human Genome Research Institute

About the National Human Genome Research Institute (NHGRI)

The Human Genome Project (HGP) was one of the great feats of exploration in history: an inward voyage of discovery rather than an outward exploration of the planet or the cosmos. It was an international research effort to sequence and map all of the genes, together known as the genome, of members of our species, Homo sapiens. Completed in April 2003, the HGP gave us the ability, for the first time, to read nature's complete genetic blueprint for building a human being.

Access the Data

The NHGRI homepage, http://www.genome.gov/, can be browsed to find individual datasets. Some datasets require administrative approval before you can access them. A partial but lengthy listing of individual datasets appears in the next section.

Contents of the Data

A variety of databases can be found on the website. Databases supported by the NHGRI:

Genome Informatics and Computational Biology Program

FlyBase - Drosophila Database

The Generic Model Organism Project (GMOD) - A Toolkit for Creating New Community Databases of Biology

Gene Ontology Consortium - Controlled Vocabularies for Gene Product Attributes

MGI - The Mouse Genome Database

Reactome - A Curated Resource of Core Pathways and Reactions in Human Biology

RGD - The Rat Genome Database

SGD - Saccharomyces Genome Database

UCSC Genome Bioinformatics - A Portal for Genomic Data

UniProt KnowledgeBase - A Portal for Curated Information on Protein Sequence, Classification and Function

WormBase - The C. elegans Genome Database

zfin.org - The Zebrafish Information Network

Genome Informatics and Computational Biology Program

HapMap Project - International HapMap Project

UniProt KnowledgeBase - A Portal for Curated Information on Protein Sequence, Classification and Function

Online Mendelian Inheritance in Man (OMIM) - A Catalog of Human Genes and Genetic Disorders

Gene Ontology Consortium - Controlled Vocabularies for Gene Product Attributes

Gene Ontology Annotation (GOA) - A database of gene ontology annotations for human proteins

Reactome - A Curated Resource of Core Pathways and Reactions in Human Biology

HUGO Human Genome Nomenclature Committee (HGNC) - A Resource for Human Gene Names

GeneTests - A Medical Genetics Information Resource

PharmGKB - The Pharmacogenetics and Pharmacogenomics Knowledge Base

The ENCODE Project: ENCyclopedia Of DNA Elements

The data produced by ENCODE Consortium members are deposited in public databases and are available for all to use without restriction. Data linked to the genomic sequence are stored and visualized on the University of California, Santa Cruz browser:

ENCODE - ENCODE Data Coordination Center

Other, non-sequence-based data, such as those from microarray studies, are available in public databases such as the Gene Expression Omnibus (GEO) and ArrayExpress.

The Knockout Mouse Project (KOMP)

The KOMP Data Coordination Center (DCC) Web site can be found at www.knockoutmouse.org. The KOMP Data Coordination Center is being established in close collaboration with all members of the KOMP Research Network.

The Cancer Genome Atlas (TCGA)

The TCGA Data Portal provides a platform for researchers to search, download, and analyze data sets generated by TCGA. This portal contains all TCGA data pertaining to clinical information associated with cancer tumors and human subjects, genomic characterization, and high-throughput sequencing analysis of the tumor genomes.

http://tcga-data.nci.nih.gov/tcga/homepage.htm

A DNA Polymorphism Discovery Resource

The purpose of this resource is to facilitate the discovery of large numbers of sequence variants in human DNA. To that end, the resource includes samples representative of the genetic diversity found in the U.S. population. This resource is not intended to contain within it sufficient information to allow the study of how the variation relates to disease or other phenotypes. The variants found by use of this resource can be used subsequently by researchers to study how the variation relates to disease and health in projects aimed specifically at particular diseases or traits.

http://www.genome.gov/10001552

Many, many more...

National Center for Biotechnology Information (NCBI) Databases and Tools:

  • NCBI [ncbi.nlm.nih.gov]
  • EntrezGene [ncbi.nlm.nih.gov]

A searchable database of genes

  • NCBI Reference Sequence Project (RefSeq) [ncbi.nlm.nih.gov]
  • Macromolecular 3D Structures Database (MMDB) [ncbi.nlm.nih.gov]
  • GenBank [ncbi.nlm.nih.gov]
  • The Genome Database [gdb.org]
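Several of the NCBI resources listed above can also be searched programmatically. The following is a minimal sketch, not an official client, that assumes NCBI's public E-utilities ESearch endpoint, which this page does not mention. The function names build_esearch_url and search_gene_ids are hypothetical, chosen only for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the public NCBI E-utilities ESearch endpoint (not named on this page).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="gene", retmax=5):
    """Return an ESearch URL that requests matching record IDs as JSON."""
    params = {"db": db, "term": term, "retmode": "json", "retmax": retmax}
    return ESEARCH_URL + "?" + urlencode(params)


def search_gene_ids(term, retmax=5):
    """Query the Entrez Gene database over the network and return matching IDs."""
    with urlopen(build_esearch_url(term, retmax=retmax)) as response:
        payload = json.load(response)
    # ESearch JSON responses list the matching record IDs under esearchresult.idlist.
    return payload["esearchresult"]["idlist"]
```

With network access, a call such as search_gene_ids("BRCA1[Gene Name] AND Homo sapiens[Organism]") should return a short list of Entrez Gene identifiers. Passing a different db value to build_esearch_url targets other Entrez databases in the same way.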

Nucleotide Sequence Databases

  • GenBank [ncbi.nlm.nih.gov]
  • EMBL Nucleotide Sequence Database [ebi.ac.uk]
  • DNA Data Bank of Japan [ddbj.nig.ac.jp]

Trace Archives (Raw Sequence Data Repositories)

  • Ensembl Trace Server [trace.ensembl.org]
  • NCBI Trace Archive [ncbi.nlm.nih.gov]

Single Nucleotide Polymorphisms (SNPs)

  • The SNP Consortium [snp.cshl.org]
  • NCBI dbSNP [ncbi.nlm.nih.gov]

cDNAs and Expressed Sequence Tags (ESTs)

  • Mammalian Gene Collection [mgc.nci.nih.gov]
  • NCBI dbEST [ncbi.nlm.nih.gov]
  • RIKEN Mouse Encyclopedia [genome.rtc.riken.go.jp]

Model Organism Databases

  • Berkeley Drosophila Genome Project [fruitfly.org]
  • Ciona savignyi Database [genome.wi.mit.edu]
  • FlyBase, Drosophila Database [flybase.bio.indiana.edu]
  • MGD, The Mouse Genome Database [informatics.jax.org]
  • SGD, Saccharomyces Genome Database [stanford.edu]
  • Tetraodon nigroviridis Database [genome.wi.mit.edu]
  • WormBase, The C. elegans Genome Database [wormbase.org]
  • Gene Ontology Consortium [geneontology.org]
  • The Generic Model Organism Project (GMOD) [gmod.org]
  • TIGR Databases and Comprehensive Microbial Resource [tigr.org]
  • Phytophthora Genome Consortium Database [pfgd.org/]

Additional Sequence, Gene and Protein Databases

  • BLOCKS [blocks.fhcrc.org]

A service for biological sequence analysis.

  • Eukaryotic Promoter Database [epd.isb-sib.ch]
  • PROSITE [expasy.org]

A database of protein families and domains.

  • SWISS-PROT [expasy.org]

A protein knowledgebase.

  • BioMagResBank [bmrb.wisc.edu]

NMR spectroscopy data on proteins, peptides, and nucleic acids.

  • Protein Data Bank (PDB) [rcsb.org]

The repository for 3-D biological macromolecular structure data.

  • DSSP [cmbi.kun.nl]

A database of secondary structure protein assignments.

  • FSSP [biocenter.helsinki.fi]

A database of fold classifications based on structure-structure alignment of proteins.

  • HSSP [cmbi.kun.nl]

A database of homology-derived secondary structure of proteins.

  • Nucleic Acid Database Project (NDB) [ndbserver.rutgers.edu]

Structural information about nucleic acids.

  • The I.M.A.G.E. Consortium [image.llnl.gov]