Computer Science

A guide to help computer science folks at NYU get to the resources they need.

Citing Data

Data should be cited in our work for the same reasons journal articles are cited: to give credit where credit is due (to the original author or producer) and to help other researchers find the material. Using data without citing it undermines both academic integrity and reproducibility. Pay attention to licenses (here's a page on those) and give attribution!

A data citation includes the typical components of other citations:

  • Title: the title of the dataset or a brief description of it if it's missing a title
  • Author or creator: the entity/entities responsible for creating the data
  • Date of publication: the date the data was published or otherwise released to the public
  • Publisher: entity responsible for hosting the data (like a repository or archive)
  • DOI (or URL if there is none): a persistent link that points to the data
  • Version or Date Downloaded: if the data has a version number, include it. Most data are published without versions, so if there is no version number, note the date you last downloaded the data, in case newer releases are made over time.
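
The components above can be sketched as a small record that assembles a citation string. This is only an illustration, not an official style: the class name `DataCitation`, its field names, the output layout, and the example dataset (including its DOI) are all hypothetical. For real submissions, follow the style your publisher or conference requires.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DataCitation:
    """Hypothetical container for the citation components listed above."""
    title: str
    creators: List[str]
    year: int                      # year of publication or public release
    publisher: str                 # repository or archive hosting the data
    locator: str                   # DOI link, or a URL if there is no DOI
    version: Optional[str] = None
    downloaded: Optional[date] = None

    def __post_init__(self) -> None:
        # A citation needs a version number or, failing that, a download date.
        if self.version is None and self.downloaded is None:
            raise ValueError("provide a version or a download date")

    def format(self) -> str:
        """Join the components into a generic, style-neutral string."""
        parts = [f"{'; '.join(self.creators)} ({self.year}). {self.title}"]
        if self.version is not None:
            parts.append(f"Version {self.version}")
        parts.append(self.publisher)
        parts.append(self.locator)
        text = ". ".join(parts)
        # Fall back to the download date only when no version is recorded.
        if self.version is None:
            text += f" (downloaded {self.downloaded.isoformat()})"
        return text


# Made-up dataset, used only to show the output shape.
example = DataCitation(
    title="Example Sensor Readings",
    creators=["Doe, J."],
    year=2021,
    publisher="Example Repository",
    locator="https://doi.org/10.1234/example",
    downloaded=date(2024, 3, 1),
)
print(example.format())
```

Making the version-or-date check a hard error reflects the last bullet: without one of the two, a reader cannot tell which release of the data you used.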

Citation standards for data sets differ by journal, publisher, and conference, but you generally have two options, depending on the situation:

  1. Use the format of a style manual as determined by a publisher or conference, such as IEEE or ACM. If you use a citation manager (highly recommended for organizing research reading!) like Zotero (which we support at NYU - check out our Zotero guide), it can export your citations in whatever format you need.
  2. Use the author's or repository's preferred citation, as listed on the page where you originally downloaded the data.

Here's an example of how to find the citation information for a dataset hosted on Zenodo, a generalist repository that houses data, code, and more:

Screenshot of the bottom of the 'MESA tool' Zenodo record, showing the citation information outlined with a red box.

Citing data tutorial (video 33 minutes, 18 seconds)