Data Sources
One of the more challenging aspects of working with genomics data is finding real data to work with. This page is a directory of places where you can access both open access data and protected access data (which requires an application). Many sites offer both open access and protected access data; those sites appear in both lists.
caution
This list is by no means complete! New data sources become available all the time, and we have added only a few to start this page off. Please open a pull request with any data sources you are aware of so we can incorporate them.
Open Access Data
Data Source Name | Types of Data | Description | Data Source URL |
---|---|---|---|
1000 Genomes Project | WGS, Variant | One of the first large-scale, international sequencing efforts, which ran from 2008 to 2015. You can learn more at the 1000genomes.org website. | Main Page, FTP |
Kids First Data Resource Commons | Expression Counts | Learn more. | Link |
St. Jude Cloud | Expression Counts | St. Jude Cloud hosts pediatric disease and cancer survivorship data for over 17,500 subjects. Learn more by reading the publication associated with St. Jude Cloud. | Link |
Refine.bio | RNA-Seq and others | A search engine for normalized transcriptome data, created by Alex's Lemonade Stand Foundation. | Link |
Protected Access Data
Data Source Name | Types of Data | Description | Data Source URL |
---|---|---|---|
International Cancer Genome Consortium | WGS, WES, RNA-Seq, Variant | Learn more. | Link |
Kids First Data Resource Commons | WGS, WES, RNA-Seq, Variant | Learn more. | Link |
St. Jude Cloud | WGS, WES, RNA-Seq, Variant | St. Jude Cloud hosts pediatric disease and cancer survivorship data for over 17,500 subjects. Learn more by reading the publication associated with St. Jude Cloud. | Link |