GitHub Repo: Awesome Public Datasets

This article introduces a list of public datasets.

Awesome Public Datasets

This is a list of high-quality, topic-centric public data sources. They are collected and tidied from blogs, answers, and user responses. Most of the data sets listed below are free; however, some are not. This project was incubated at OMNILab, Shanghai Jiao Tong University, during Xiaming Chen's Ph.D. studies. OMNILab is now part of the BaiYuLan Open AI community.

Other amazingly awesome lists can be found in sindresorhus's awesome list.

Agriculture

Architecture

Biology

Chemistry

Climate+Weather

ComplexNetworks

ComputerNetworks

CyberSecurity

DataChallenges

EarthScience

Economics

Education

Energy

Entertainment

Finance

GIS

Government

Healthcare

ImageProcessing

MachineLearning

Museums

NaturalLanguage

Neuroscience

  • OK_ICON Allen Institute Datasets \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/Allen-Institute-Datasets.yml)\]
  • OK_ICON Brain Catalogue \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/Brain-Catalogue.yml)\]
  • FIXME_ICON Brainomics \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/Brainomics.yml)\]
  • OK_ICON CodeNeuro Datasets \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/CodeNeuro-Datasets.yml)\]
  • OK_ICON Collaborative Research in Computational Neuroscience (CRCNS) \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/Collaborative-Research-in-Computational-Neuroscience-CRCNS.yml)\]
  • OK_ICON FCP-INDI \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/FCP-INDI.yml)\]
  • OK_ICON Human Connectome Project \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/Human-Connectome-Project.yml)\]
  • FIXME_ICON NDAR \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/NDAR.yml)\]
  • FIXME_ICON NIMH Data Archive \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/NIMH-Data-Archive.yml)\]
  • OK_ICON NeuroData \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/NeuroData.yml)\]
  • OK_ICON NeuroMorpho - NeuroMorpho.Org is a centrally curated inventory of digitally reconstructed \[\...\] \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/NeuroMorpho.yml)\]
  • OK_ICON Neuroelectro \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/Neuroelectro.yml)\]
  • FIXME_ICON OASIS \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/OASIS.yml)\]
  • OK_ICON OpenNEURO \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/OpenNEURO)\]
  • OK_ICON OpenfMRI \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/OpenfMRI.yml)\]
  • OK_ICON Study Forrest \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/Study-Forrest.yml)\]
  • OK_ICON The Nencki-Symfonia EEG/ERP dataset - A high-density electroencephalography (EEG) dataset \[\...\] \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/The_Nencki-Symfonia_EEG_ERP_dataset.yml)\]

Physics

ProstateCancer

Psychology+Cognition

PublicDomains

  • OK_ICON Ably Open Realtime Data \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Ably.yml)\]
  • OK_ICON Amazon \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Amazon.yml)\]
  • OK_ICON Archive.org Datasets \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Archive.org-Datasets.yml)\]
  • OK_ICON Archive-it from Internet Archive \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Archive.yml)\]
  • OK_ICON CMU JASA data archive \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/CMU-JASA-data-archive.yml)\]
  • OK_ICON CMU StatLab collections \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/CMU-StatLab-collections.yml)\]
  • OK_ICON Data.World \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Data.World.yml)\]
  • FIXME_ICON Data360 \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Data360.yml)\]
  • OK_ICON Enigma Public \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Enigma-Public.yml)\]
  • OK_ICON Google \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Google.yml)\]
  • OK_ICON Grand Comics Database - The Grand Comics Database (GCD) is a nonprofit, internet-based \[\...\] \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/GrandComics.yml)\]
  • OK_ICON Infochimps \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Infochimps.yml)\]
  • OK_ICON KDNuggets Data Collections \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/KDNuggets-Data-Collections.yml)\]
  • OK_ICON Microsoft Azure Data Market Free DataSets \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Microsoft-Azure-Data-Market-Free-DataSets.yml)\]
  • OK_ICON Microsoft Data Science for Research \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Microsoft-Data-Science-for-Research.yml)\]
  • OK_ICON Microsoft Research Open Data \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Microsoft-Research-Open-Data)\]
  • OK_ICON Open Library Data Dumps \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Open-Library-Data-Dumps.yml)\]
  • FIXME_ICON Reddit Datasets \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Reddit-Datasets.yml)\]
  • FIXME_ICON RevolutionAnalytics Collection \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/RevolutionAnalytics-Collection.yml)\]
  • OK_ICON Sample R data sets \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Sample-R-data-sets.yml)\]
  • OK_ICON Stack Overflow Annual Developer Survey - Annual developer surveys full data sets from 2011 \[\...\] \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Stack-Overflow-Annual-Developer-Survey.yml)\]
  • FIXME_ICON StatSci.org \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/StatSci.org.yml)\]
  • OK_ICON Stats4Stem R data sets (archived) \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Stats4Stem-R-data-sets.yml)\]
  • FIXME_ICON The Washington Post List \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/The-Washington-Post-List.yml)\]
  • FIXME_ICON UCLA SOCR data collection \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/UCLA-SOCR-data-collection.yml)\]
  • FIXME_ICON UFO Reports \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/UFO-Reports.yml)\]
  • OK_ICON Wikileaks 911 pager intercepts \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Wikileaks-911-pager-intercepts.yml)\]
  • FIXME_ICON Yahoo Webscope \[[Meta](https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Yahoo-Webscope.yml)\]

SearchEngines

SocialNetworks

SocialSciences

Software

Sports

TimeSeries

Transportation

eSports

Complementary Collections
