Data

FAIR, CARE, & TRUST Principles

The FAIR acronym and principles were defined in a March 2016 paper in the journal Scientific Data by a consortium of scientists and organizations.

Wilkinson et al. (2016) established the guidelines to improve the Findability, Accessibility, Interoperability, and Reuse (FAIR) of digital assets for research.

Carroll et al. (2020) established the CARE Principles for Indigenous Data Governance.

Lin et al. (2020) created the TRUST principles for digital repositories.

  • TRUST Principles - digital repository transparency and sustainability principles

Data Use Agreements

MIT model data use agreements - templates for administrative data sharing agreements

Databrary Access Agreement - permits restricted access for data use and contribution by institutionally-authorized researchers who join Databrary

Regulations

European General Data Protection Regulation (GDPR) - EU data privacy and protection regulation

US HIPAA - health information privacy and security standards

License Types

Creative Commons Licenses - open content licenses for creative works

Open Datacommons - open licenses specifically for data and databases

Open Source Initiative Licenses - approved open source software licenses

Accessibility

W3C ARIA Suite - Accessible Rich Internet Applications (ARIA)

Web Accessibility Evaluation Tools List - comprehensive list of web accessibility testing tools

WAI-ARIA - Web Accessibility Initiative ARIA standards and guidelines

Mozilla ARIA - developer documentation for ARIA implementation

NVDA - screen reading software

JAWS - screen reading software

Data Repositories

Nature Scientific Data Recommended Repositories - meta-list of data repositories recommended for Nature science articles

Databrary - NSF- and NIH-funded repository specialized for storing and sharing video data and other identifiable data from human subjects research. Based at New York University.

Dryad - started as a US NSF project; used for datasets of any type that correspond to a research paper.

EBRAINS - European brain research infrastructure and data platform

EUDAT - network of European research organizations for data services

European Data Portal - central access point for European open data

Figshare - repository for research outputs in any format

GitHub - version control and code repository platform

G-Node - German Neuroinformatics Node

Harvard Dataverse - offers free storage of research data; owned by Harvard

Merritt - an open-source digital preservation repository maintained by the University of California Curation Center (UC3) at the California Digital Library (CDL).

Mendeley Data - free research data repository by Elsevier

NITRC - Neuroimaging Informatics Tools and Resources Collaboratory

NeuroLibre - preprints for computational neuroscience with executable notebooks

Open Access Directory - curated list of open data repositories

OpenNeuro - free platform for neuroimaging data sharing

Open Science Framework - free platform for research workflow and collaboration

PLOS Recommended Repositories - discipline-specific repositories recommended by PLOS journals

PrePubMed - search engine for biomedical preprints and publications

Registry of Research Data Repositories - global registry of research data repositories

The Winnower (defunct) - platform is no longer active

Zenodo - EU-funded OpenAIRE project hosted by CERN. Useful for EU-funded projects because it reports back to the EU research Participant Portal.

Protocols and Bench Techniques

BioProtocol - peer-reviewed protocol journal for biological research

Current Protocols - step-by-step laboratory protocols across life sciences

Gold Biotechnology Protocol list - collection of molecular biology protocols and techniques

JoVE - Journal of Visualized Experiments with video protocols

Nature Protocols - peer-reviewed journal for research protocols

OpenWetWare - wiki for sharing biology protocols and methods

Protocol Exchange (archived) - protocol sharing platform, now part of protocols.io

Protocols Online - free collection of life science protocols

Protocols.io - largest open repository of scientific protocols

SciGene - laboratory equipment manufacturer for cytogenetics

Springer Nature Experiments - platform for publishing and discovering research methods

Software & Code Repositories

GitHub - most widely used version control and code hosting platform

GitLab - DevOps platform with Git repository management

SourceForge - hosting for open source software projects

Apache Subversion - centralized version control system

Agricultural Data Repositories

Berkeley Library Agricultural and Resource Economics - guide to finding agricultural and economics data

Data.World Agriculture Datasets - curated agricultural data collections

Farmers.gov Data - USDA data portal for farmers and researchers

USDA Ag Data Commons - US Department of Agriculture central data repository

USDA Forest Service Data Clearinghouse - geospatial forestry and natural resource data

USDA Forest Service ArcGIS Data - Forest Service ArcGIS Hub with maps

World Bank Agriculture Data - global agricultural indicators and statistics

Biological Data Repositories

European Molecular Biology Laboratory (EMBL) - European Bioinformatics Institute with genomics databases

Global Biodiversity Information Facility (GBIF) - 1.6B+ species occurrence records worldwide

National Center for Biotechnology Information (NCBI) - US genomics and biomedical literature databases

NIMH Genetics - NIH mental health genetics research repository

Public Data on Commercial Cloud

Earth on AWS - The Registry of Open Data on AWS helps you discover and share datasets that are available via AWS resources.

Google Earth Engine - petabyte-scale planetary geospatial analysis platform

Google Earth Engine Community Catalog

Awesome GEE Community Catalog - community-driven collection of 500+ geospatial datasets for Google Earth Engine, curated by Samapriya Roy. Includes atmospheric data, elevation models, land use/land cover, climate data, and specialized geoscience datasets not available in the official GEE catalog.

Microsoft AI for Earth Datasets - environmental and sustainability datasets on Azure

Microsoft Planetary Computer - multi-petabyte Earth observation data catalog with analytics

Torrents

Academic Torrents - BitTorrent repository for research datasets and papers

US Government Data

GeoPlatform.gov - federal geospatial data and mapping platform

US Government Public Data - central repository for US federal open data

EPA Data - Environmental Protection Agency datasets and tools

NASA Data - NASA's open data portal for space missions

USGS Data - US Geological Survey science data catalog

EU Government Data

European Space Agency Data - ESA Earth observation missions and data

Copernicus Climate Data Store - European climate reanalysis and projection datasets

Ecological Research Data Repositories

eBird - provides open data access in several formats to logged-in users, ranging from raw data to processed datasets geared toward more rigorous scientific modeling.

Environmental Data Initiative - repository for Long Term Ecological Research (LTER) network as well as community contributed datasets

FLUXNET - global network of eddy covariance flux towers

Long Term Agricultural Research (LTAR) - USDA long-term agroecosystem research data

National Ecological Observatory Network (NEON) - continental-scale ecological observation data from 81 sites

NCAR/UCAR Data Archive - atmospheric and climate research data archive

US National Phenology Network - plant and animal phenology observations

Earth Science Research Data Repositories

Multi-Mission Algorithm and Analysis Platform (MAAP) - NASA/ESA platform for biomass and forest data

NOAA Data - ocean, atmosphere, and climate datasets

UK Met Office Data - UK weather and climate research data

Open Data Cube - open-source platform for satellite Earth observation analysis

Africa Regional Data Cube - satellite data infrastructure for Africa

Australia Open Data Cube - Australian government's Earth observation data platform

Brazil Data Cube - Brazilian satellite data infrastructure and analysis

Public Data Sets

OpenStreetMap (OSM) - collaborative global map created by volunteers

OSM US Forest Service Data - Forest Service datasets available in OpenStreetMap

Standards

W3C (World Wide Web Consortium) - international web standards organization

EDM Council - enterprise data management standards and ontologies

Additional Resources

Figshare - repository for research outputs in any format

Nature Data Repository Guidance - recommended repositories for Nature publications

Registry of Research Data Repositories (re3data) - global registry of 3000+ research data repositories

PLOS Open Data - PLOS policies and guidance for data