Data
FAIR, CARE, & TRUST Principles
The FAIR acronym and principles were defined in a March 2016 paper in the journal Scientific Data by a consortium of scientists and organizations.
Wilkinson et al. (2016) established the guidelines to improve the Findability, Accessibility, Interoperability, and Reuse (FAIR) of digital assets for research.
- GO FAIR website - official FAIR principles implementation guidance
Carroll et al. (2020) established the CARE Principles for Indigenous Data Governance.
- full document - one-page CARE principles summary
- US Indigenous Data Sovereignty Network - advancing Indigenous data governance and sovereignty
Lin et al. (2020) created the TRUST Principles for digital repositories.
- TRUST Principles - digital repository transparency and sustainability principles
Data Use Agreements
MIT model data use agreements - templates for administrative data sharing agreements
Databrary Access Agreement - permits restricted access for data use and contribution by institutionally-authorized researchers who join Databrary
Regulations
European General Data Protection Regulation (GDPR) - EU data privacy and protection regulation
US HIPAA - health information privacy and security standards
License Types
Creative Commons Licenses - open content licenses for creative works
Open Data Commons - open licenses specifically for data and databases
Open Source Initiative Licenses - approved open source software licenses
Accessibility
W3C ARIA Suite - Accessible Rich Internet Applications (ARIA)
Web Accessibility Evaluation Tools List - comprehensive list of web accessibility testing tools
WAI-ARIA - Web Accessibility Initiative ARIA standards and guidelines
Mozilla ARIA - developer documentation for ARIA implementation
NVDA - screen reading software
JAWS - screen reading software
Data Repositories
Nature Scientific Data Repositories - Recommended - meta-list of data repositories recommended for articles in Nature journals
Databrary - NSF- and NIH-funded repository specialized for storing and sharing video data and other identifiable data from human subjects research. Based at New York University.
Dryad - started as a US NSF project; used for datasets of any type that correspond to a research paper.
EBRAINS - European brain research infrastructure and data platform
EUDAT - network of European research organizations for data services
European Data Portal - central access point for European open data
Figshare - repository for research outputs in any format
GitHub - version control and code repository platform
G-Node - German Neuroinformatics Node
Harvard Dataverse - free research data repository operated by Harvard University
Merritt - an open-source digital preservation repository maintained by the University of California Curation Center (UC3) at the California Digital Library (CDL).
Mendeley Data - free research data repository by Elsevier
NITRC - Neuroimaging Informatics Tools and Resources Collaboratory
NeuroLibre - preprints for computational neuroscience with executable notebooks
Open Access Directory - curated list of open data repositories
OpenNeuro - free platform for neuroimaging data sharing
Open Science Framework - free platform for research workflow and collaboration
PLOS Recommended Repositories - discipline-specific repositories recommended by PLOS journals
PrePubMed - search engine for biomedical preprints and publications
Registry of Research Data Repositories - global registry of research data repositories
The Winnower (defunct) - platform is no longer active
Zenodo - EU-funded project by OpenAIRE, hosted by CERN. Useful for EU-funded projects because it reports back to the EU research Participant Portal.
Protocols and Bench Techniques
BioProtocol - peer-reviewed protocol journal for biological research
Current Protocols - step-by-step laboratory protocols across life sciences
Gold Biotechnology Protocol list - collection of molecular biology protocols and techniques
JoVE - Journal of Visualized Experiments with video protocols
Nature Protocols - peer-reviewed journal for research protocols
OpenWetWare - wiki for sharing biology protocols and methods
Protocol Exchange (archived, now part of protocols.io) - protocol sharing platform merged with protocols.io
Protocols Online - free collection of life science protocols
Protocols.io - largest open repository of scientific protocols
SciGene - laboratory equipment manufacturer for cytogenetics
Springer Nature Experiments - platform for publishing and discovering research methods
Software & Code Repositories
GitHub - most widely used version control and code hosting platform
GitLab - DevOps platform with Git repository management
SourceForge - hosting for open source software projects
Apache Subversion - centralized version control system
Agricultural Data Repositories
Berkeley Library Agricultural and Resource Economics - guide to finding agricultural and economics data
Data.World Agriculture Datasets - curated agricultural data collections
Farmers.gov Data - USDA data portal for farmers and researchers
USDA Ag Data Commons - US Department of Agriculture central data repository
USDA Forest Service Data Clearinghouse - geospatial forestry and natural resource data
USDA Forest Service ArcGIS Data - Forest Service ArcGIS Hub with maps
World Bank Agriculture Data - global agricultural indicators and statistics
Biological Data Repositories
European Molecular Biology Laboratory (EMBL) - European Bioinformatics Institute with genomics databases
Global Biodiversity Information Facility (GBIF) - 1.6B+ species occurrence records worldwide
National Center for Biotechnology Information (NCBI) - US genomics and biomedical literature databases
NIMH Genetics - NIH mental health genetics research repository
Public Data on Commercial Cloud
Earth on AWS - The Registry of Open Data on AWS helps you discover and share datasets that are available via AWS resources.
Google Earth Engine - petabyte-scale planetary geospatial analysis platform
Google Earth Engine Community Catalog
Awesome GEE Community Catalog - community-driven collection of 500+ geospatial datasets for Google Earth Engine, curated by Samapriya Roy. Includes atmospheric data, elevation models, land use/land cover, climate data, and specialized geoscience datasets not available in the official GEE catalog.
Microsoft AI for Earth Datasets - environmental and sustainability datasets on Azure
Microsoft Planetary Computer - multi-petabyte Earth observation data catalog with analytics
Torrents
Academic Torrents - BitTorrent repository for research datasets and papers
US Government Data
GeoPlatform.gov - federal geospatial data and mapping platform
US Government Public Data - central repository for US federal open data
EPA Data - Environmental Protection Agency datasets and tools
NASA Data - NASA's open data portal for space missions
USGS Data - US Geological Survey science data catalog
EU Government Data
European Space Agency Data - ESA Earth observation missions and data
Copernicus Climate Data Store - European climate reanalysis and projection datasets
Ecological Research Data Repositories
eBird - provides open data access in several formats to logged-in users, ranging from raw data to processed datasets geared toward more rigorous scientific modeling.
Environmental Data Initiative - repository for the Long Term Ecological Research (LTER) network as well as community-contributed datasets
FLUXNET - global network of eddy covariance flux towers
Long Term Agricultural Research (LTAR) - USDA long-term agroecosystem research data
National Ecological Observatory Network (NEON) - continental-scale ecological observation data from 81 sites
NCAR/UCAR Data Archive - atmospheric and climate research data archive
US National Phenology Network - plant and animal phenology observations
Earth Science Research Data Repositories
Multi-Mission Algorithm and Analysis Platform (MAAP) - NASA/ESA platform for biomass and forest data
NOAA Data - ocean, atmosphere, and climate datasets
- NOAA Climate Data - historical climate observations and weather records
UK Met Office Data - UK weather and climate research data
Open Data Cube - open-source platform for satellite Earth observation analysis
Africa Regional Data Cube - satellite data infrastructure for Africa
Australia Open Data Cube - Australian government's Earth observation data platform
Brazil Data Cube - Brazilian satellite data infrastructure and analysis
Public Data Sets
OpenStreetMap (OSM) - collaborative global map created by volunteers
OSM US Forest Service Data - Forest Service datasets available in OpenStreetMap
Standards
W3C (World Wide Web Consortium) - international web standards organization
EDM Council - enterprise data management standards and ontologies
Additional Resources
Figshare - repository for research outputs in any format
Nature Data Repository Guidance - recommended repositories for Nature publications
Registry of Research Data Repositories (re3data) - global registry of 3000+ research data repositories
PLOS Open Data - PLOS policies and guidance for data