A simple and effective way to make your data accessible is to store it in a repository. A repository is a storage facility (sometimes also a curation facility) where users can upload and download data and make it accessible and discoverable, whether to fulfill grant requirements or to support the free sharing of scientific knowledge.
There are some things to keep in mind when selecting a repository. Data in a repository should be:
Persistent (not likely to be modified)
Searchable and browsable
Easy to retrieve or download
A wide variety of institution-based and discipline-specific repositories exist for digital data. The repository itself should be:
If both a discipline-specific repository and an institution-based one exist for your data, then consider depositing in both locations to maximize discovery and safety of the data.
Many more data repositories are available online than can be listed here. Consult re3data.org, an external resource, for an extensive list of discipline-specific repositories.
Other tools to help you find a repository:
DataCite Repository Finder
About: A tool launched by DataCite to help people identify and locate online repositories of research data. It draws on the re3data listings for repository information.
Open Access Directory's Data Repositories Wiki
About: A list of repositories and databases for locating and depositing open data.
Dataverse Network Project
About: The Dataverse Network is an application to publish, share, reference, extract, and analyze research data. It facilitates making data available to others and allows you to replicate others' work. Researchers and data authors get credit, publishers and distributors get credit, and affiliated institutions get credit.
How to archive your data: Create your own Dataverse at Harvard's IQSS. Once you have created a Dataverse, you are free to upload, describe, and share datasets on your own.
NYU Faculty Digital Archive (FDA)
About: The Faculty Digital Archive is a place where full-time NYU faculty can deposit their work in digital form. FDA collections can be shared with the world, or restricted to selected people. The FDA is intended to be a highly visible repository of NYU faculty digital scholarship.
How to archive your data: For more information on the FDA or to request space on the FDA for your materials, please e-mail firstname.lastname@example.org.
Inter-University Consortium for Political and Social Research (ICPSR)
About: An international consortium of about 700 academic institutions and research organizations, ICPSR provides leadership and training in data access, curation, and methods of analysis for the social science research community. ICPSR maintains a data archive of more than 500,000 files of research in the social sciences.
How to archive your data: Fill out ICPSR's Data Deposit Form. ICPSR's archivists will review your submission and work with you to make it ready for release.
Qualitative Data Repository (QDR)
About: QDR is a dedicated repository for preserving and sharing the digital assets associated with social science and mixed methods projects. It was founded with support from the National Science Foundation and the Center for Qualitative and Multi-Method Inquiry, a unit of the Maxwell School of Citizenship and Public Affairs at Syracuse University.
The Cell: An Image Library
About: A freely accessible, easy-to-search public repository of reviewed and annotated images, videos, and animations of cells of all types from a variety of organisms, showcasing cell architecture, intracellular structures and functions, and both normal and abnormal processes. The project relies on the cell biology community to populate the library.
GenBank
About: The NIH genetic sequence database, an annotated collection of all publicly available DNA sequences (Nucleic Acids Research, 2013 Jan;41(D1):D36-42). GenBank is part of the International Nucleotide Sequence Database Collaboration, which comprises the DNA DataBank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL), and GenBank at NCBI.
Global Biodiversity Information Facility (GBIF)
About: "Free and open access to biodiversity data." Launched in 2007 by institutions in 17 countries under a non-binding inter-governmental agreement.
Morphbank
About: Holds biological images documenting a wide variety of research, including specimen-based research in comparative anatomy, morphological phylogenetics, taxonomy, and related fields focused on increasing our knowledge about biodiversity. The project receives its main funding from the Biological Databases and Informatics program of the National Science Foundation (Grant DBI-0446224).
National Biological Information Infrastructure
About: A broad, collaborative program to provide increased access to data and information on the nation's biological resources. The NBII links diverse, high-quality biological databases, information products, and analytical tools maintained by NBII partners and other contributors in government agencies, academic institutions, non-government organizations, and private industry. (Note: The NBII was terminated in the President's budget for Fiscal Year 2012.)
The Paleobiology Database
About: "We are bringing together taxonomic and distributional information about the entire fossil record of plants and animals." From a large number of researchers at a large number of institutions.
Cambridge Structural Database
About: Maintained by the Cambridge Crystallographic Data Centre (CCDC), a non-profit, charitable institution whose objectives are the general advancement and promotion of the science of chemistry and crystallography for the public benefit.
Crystallography Open Database
About: A joint project of the Mineralogical Society of America, Mineralogical Association of Canada, European Journal of Mineralogy, International Union of Crystallography, and the US National Science Foundation. Data are in the public domain.
Open Notebook Science Solubility Challenge
About: Maintained by Jean-Claude Bradley, Rajarshi Guha, Andrew Lang, and Cameron Neylon. A database of non-aqueous solubility measurements with links to the lab notebook pages where the experiments were recorded. The database can be searched via Web Query or alternate means.
ZINC
About: "A free database of commercially-available compounds for virtual screening." From the Shoichet Laboratory in the Department of Pharmaceutical Chemistry at the University of California, San Francisco.
About: Keeps your public and private code available, secure, and backed up.
SourceForge
About: 2.7 million developers create powerful software in over 260,000 projects. Our popular directory connects more than 46 million consumers with these open source projects and serves more than 2,000,000 downloads a day. SourceForge is where open source happens.
SNAP
About: Stanford Large Network Dataset Collection. The SNAP library has been under active development since 2004 and is growing organically as a result of our research pursuits in the analysis of large social and information networks. The largest network we have analyzed so far using the library was the Microsoft Instant Messenger network from 2006, with 240 million nodes and 1.3 billion edges.
OpenEI: Open Energy Information
About: Freely-available energy data, tools, models, and other resources.
Climate Change Data Portal
About: From the Environment Department of the World Bank.
The Marine Geoscience Data System (MGDS)
About: The Marine Geoscience Data System (MGDS) provides access to data portals for the NSF-supported Ridge 2000 and MARGINS programs, the Antarctic and Southern Ocean Data Synthesis, the Global Multi-Resolution Topography Synthesis, and the Seismic Reflection Field Data Portal.
IRIS (Incorporated Research Institutions for Seismology)
About: From 100+ US universities and the National Science Foundation.
Geosciences & Geospatial Data
EarthChem
About: Holds data systems and services for geochemical, geochronological, and petrological data, developed and maintained by EarthChem, including the EarthChem Library, the EarthChem Portal, PetDB, NAVDAT, SedDB, and Geochron. EarthChem is operated by a joint team of disciplinary scientists, data scientists, data managers, and information technology developers who are part of the NSF-funded data facility Integrated Earth Data Applications (IEDA).
The Geosciences Network (GEON)
About: The GEON project is a collaboration among a dozen PI institutions and a number of other partner projects, institutions, and agencies to develop cyberinfrastructure in support of an environment for integrative geoscience research. GEON is funded by the NSF Information Technology Research (ITR) program.
National Geographic Data Center
About: Archive of national and international marine environmental and ecosystem datasets.
The National Space Science Data Center
About: The NSSDC serves as the permanent archive for NASA space science mission data. "Space science" means astronomy and astrophysics, solar and space plasma physics, and planetary and lunar science. As the permanent archive, NSSDC works with NASA's discipline-specific space science "active archives," which provide access to data to researchers and, in some cases, to the general public.
MIRAGE (Middlesex medical Image Repository with a CBIR ArchivinG Environment)
About: From JISC and Middlesex University.
National Center for Biotechnology Information (NCBI)
About: The National Center for Biotechnology Information advances science and health by providing access to biomedical and genomic information.
Virginia Henderson Global Nursing e-Repository
About: Nursing research data.
CERN Scientific Information
About: Online particle physics data and information.
NIST Atomic Spectra Database
About: The Atomic Spectra Database (ASD) contains data for radiative transitions and energy levels in atoms and atomic ions. Data are included for observed transitions of 99 elements and energy levels of 56 elements.
DataONE
About: An international federation of data repositories containing earth observations data, including data from fields such as ecology, biology, evolution, and environmental sciences such as hydrology, oceanography, and atmospheric science. DataONE is a federation with participation from hundreds of field stations, universities, and government agencies through the DataONE Member Nodes.
Dryad
About: An international repository of data underlying scientific and medical publications, particularly data for which no specialized repository exists. All material in Dryad is associated with a scholarly publication. Most data in the repository are associated with peer-reviewed articles, although data associated with non-peer-reviewed publications from reputable academic sources, such as dissertations, are also accepted. Dryad is a non-profit organization.
FigShare
About: Scientific publishing as it stands is an inefficient way to do science on a global scale. FigShare allows you to share all of your data, negative results, and unpublished figures.
Knowledge Network for Biocomplexity (KNB)
About: The Knowledge Network for Biocomplexity (KNB) is an international data repository containing ecology, biology, and environmental science data with a global distribution. The KNB is a grass-roots partnership of collaborating field stations, laboratories, and research networks that openly publish and share data. The KNB is a Member Node within the DataONE data federation.
PANGAEA
About: PANGAEA stands for "Publishing Network for Geoscientific & Environmental Data". Open to deposits from any scientist. Most datasets are open; some are restricted. Hosted by the Alfred Wegener Institute for Polar and Marine Research and the University of Bremen's Center for Marine Environmental Sciences.