Data Sources

An overview of data sources at NYU and beyond

Alternative Sources for Government Data

Recently, there have been significant changes to the availability of federal data. These data are heavily used by researchers and the general public; they also undergird policy and funding decisions made on behalf of people across the US and worldwide. We are following these developments closely and contributing to the many data rescue efforts by organizations and individuals across the research community.

If you are having difficulty accessing data you use for your research and/or teaching, or are concerned about posting data you have created on federally hosted websites, please reach out to Data Services via our request form.

Alternative Sources for Government Data (General)

Internet Archive

The Internet Archive is a free, non-profit public digital library and archive that preserves and provides access to digitized books, webpages, images, audio, and video. Archived government information can be found in several places:

  • The Wayback Machine: The Wayback Machine primarily contains archived websites, videos (television broadcasts and films), and audio recordings.
  • US Government Documents: This government-specific Internet Archive collection contains digital versions of United States Government documents, including agency documents, executive orders, and archived government webpages.
  • Democracy's Library: A project of the Internet Archive focused on Government Datasets, Documents, and Records (excluding webpages).
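
For readers who want to check programmatically whether a federal webpage has been captured, the Wayback Machine offers a public availability endpoint. The following is a minimal Python sketch, not an official client: the helper names (build_availability_url, closest_snapshot, find_snapshot) are hypothetical, and the example page and date in the usage comment are illustrative assumptions only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public Wayback Machine availability endpoint (returns JSON).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def build_availability_url(page_url, timestamp=None):
    """Build the query URL for a page; timestamp is optional (YYYYMMDD)."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{AVAILABILITY_ENDPOINT}?{urlencode(params)}"


def closest_snapshot(response):
    """Return the URL of the closest archived snapshot, or None if absent."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def find_snapshot(page_url, timestamp=None):
    """Query the endpoint over the network and return the snapshot URL."""
    with urlopen(build_availability_url(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))


# Example call (requires network access; page and date are illustrative):
# find_snapshot("data.cdc.gov", "20250127")
```

If no capture exists, find_snapshot returns None; in that case, try one of the independent archives listed on this page.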

Independent Archives of Government Data

  • End of Term Web Archive: The End of Term Web Archive captures and saves U.S. Government websites at the end of presidential administrations.
  • DataLumos: DataLumos is an ICPSR (Inter-university Consortium for Political and Social Research) archive for U.S. Government and other social science data. Users need to log in to access it. When prompted, click "sign in with Google" and use your NYU NetID credentials.
  • IPUMS: IPUMS (originally the Integrated Public Use Microdata Series) provides census and survey data from around the world.
  • Source Cooperative - Data.gov Mirror: A project of the Harvard Law School Library Innovation Lab, this is a regularly updated mirror of Data.gov, the US federal government data finding and storage site.

Alternative Sources for Science and Public Health Data

  • CDC Data on the Internet Archive: An archive of all CDC datasets uploaded to https://data.cdc.gov/browse before January 28th, 2025. Excludes corrupt datasets and data not publicly accessible.
  • Public Environmental Data Project: A volunteer coalition of several environmental, justice, and policy organizations, researchers across several universities, archivists, and students who rely on federal datasets and tools to support critical research, advocacy, policy, and litigation work.
  • Environmental Data and Governance Initiative (EDGI): The Environmental Data & Governance Initiative (EDGI) is a research collaborative and network of diverse professionals promoting evidence-based policy-making and public interest science that advances the Environmental Right to Know (ERTK). They have been archiving US federal environmental data.
  • The Climate Mirror Project: Led by the Data Refuge Project of the University of Pennsylvania. “The Climate Mirror Project is trying to mirror and safely archive U.S. government websites and datasets related to climate, climate change, and global warming.”

Alternative Sources for Law and Justice Data

  • Silencing Science Tracker: A joint initiative of the Sabin Center for Climate Change Law and the Climate Science Legal Defense Fund. Tracks government attempts to restrict or prohibit scientific research, education, or discussion.

Data Rescue Project

The Data Rescue Project is a coordinated effort among a group of data organizations, including IASSIST, RDAP, and members of the Data Curation Network. Their goal is to serve as a clearinghouse for data rescue-related efforts and data access points for public US governmental data that are currently at risk.

The Data Rescue Project website includes useful resources for researchers and students. There are also instructions on how to contribute to the Data Rescue Project efforts.

  • Data Rescue Tracker: A collaborative tool built to catalog existing public data rescue efforts to coordinate better across initiatives. The Data Rescue Tracker aims to provide a consolidated overview of who is downloading which dataset from which government websites.

Checklists for Data Rescue:

  • DataRescue Workflow: This is the workflow from the original data rescue/DataRefuge project in 2017. Many of the tools are no longer working, but the workflow is still useful. Part of this effort is also housed in the Harvard Dataverse Repository and can be opened for more data deposits.