
Research Software: Designing for Publication and Reproducibility

How to prepare scholarly code for submission to journals or repositories.

Setting Up Your Project

Adding a few helpful files to your repository, whether it's on GitHub, GitLab, Bitbucket, or just a zip file you want to share with a collaborator, will make your code much easier to use and understand. It's helpful to ask yourself, "What would someone who can't get in touch with me need to know to make this program run?" The files described below provide that information in a structured way.

Some programming languages have templates for setting up your repository with these files. 

For general help with file structure and organization and for more specific information about organizing data (as opposed to code) in a repository, please see the Research Data Management guide.

A 1970s photograph shows a white woman with a ponytail standing in front of an open file cabinet drawer, which is packed with files.

Image credit: Manuscripts and Archives Division, The New York Public Library. (1972). Ellen Ratner at filing cabinet. Retrieved from New York Public Library Digital Collections.

 

Resources

README

Always include a README with your code. A few essential pieces of information to include in the README:

  • The code's purpose
  • Installation instructions for the code and the dependencies it requires. Always state the required version of each dependency, whether exact or a minimum (e.g. "TensorFlow 2.1.1," "Python 3.7 or later"); see the Dependencies page of this guide for more information on documenting dependencies. Indicate whether your software can be installed through a package manager or distribution such as CRAN, PyPI, or Conda
  • How to run the code: the order of operations for running different pieces of the program, expected inputs and outputs
  • Troubleshooting notes
  • If the code is written to run on a specific dataset and the data is not packaged with the code, provide a link to the data's location; likewise, if the code is related to an academic paper you have published, link to that as well
    • If your dataset is proprietary or cannot be shared due to privacy concerns, it's helpful to package some sample data with the software so another user can make sure it works properly.
  • A citation showing how you would like others to cite your software
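
One way to collect the dependency versions recommended for the README is to read them from the environment in which the code actually runs. The following is a minimal Python sketch, assuming a Python project; the function name describe_dependencies and the package names passed to it are hypothetical placeholders, not part of any tool mentioned in this guide.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def describe_dependencies(packages):
    """Return README-ready lines giving the installed version of each package."""
    # Start with the version of the Python interpreter running this script.
    lines = [f"Python {platform.python_version()}"]
    for name in packages:
        try:
            lines.append(f"{name} {version(name)}")
        except PackageNotFoundError:
            # Flag anything missing from this environment rather than guessing.
            lines.append(f"{name} (not installed in this environment)")
    return lines


if __name__ == "__main__":
    # Hypothetical dependency list; replace with the packages your code imports.
    for line in describe_dependencies(["numpy", "tensorflow"]):
        print(line)
```

The printed lines can be pasted into the installation section of the README. They describe only the environment the script was run in, so it is worth running it in the same environment that produced your results.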

Image of the README for software called Spatial Frequency Preferences, including links to a DOI and to open in Binder, and the text "An fMRI experiment to determine the relationship between spatial frequency and eccentricity in the human early visual cortex. See the paper for scientific details. If you re-use some component of this project in an academic publication, see the citing section for how to credit us." A table of contents follows, listing Usage, Notebooks, Model parameters, and Model exploration widget.

README from Spatial Frequency Preferences, research software created by NYU Center for Neural Science graduate Dr. Billy Broderick

Citation File

A citation file, either CITATION.cff or codemeta.json, stores machine-readable metadata for research software, allowing it to be cited accurately wherever it appears on the web. The two formats are similar, and both work for display on GitHub, Zenodo, and elsewhere. There is also a tool for converting a CFF file to CodeMeta, and CodeMeta has a simple form on its GitHub site that generates a well-formatted codemeta.json file for inclusion in a repository. Since one of the primary goals of publishing research is to ensure you receive credit for your work, it's advisable to include one of these files, which can be easily imported into a bibliography manager and harvested by metadata aggregators and other search tools.
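
To give a sense of what such a file contains, here is a minimal Python sketch that assembles a small codemeta.json record using only the standard library. It assumes CodeMeta 2.0 and uses only a handful of the terms that CodeMeta defines; the function build_codemeta and every value passed to it (project name, author, license, repository URL) are hypothetical placeholders to be replaced with your own.

```python
import json


def build_codemeta(name, version, given_name, family_name, license_url, repository_url):
    """Return a small CodeMeta 2.0 record as a Python dictionary."""
    return {
        # The context and type identify the file as CodeMeta software metadata.
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "version": version,
        "author": [
            {"@type": "Person", "givenName": given_name, "familyName": family_name}
        ],
        "license": license_url,
        "codeRepository": repository_url,
    }


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    record = build_codemeta(
        name="my-analysis",
        version="1.0.0",
        given_name="Ada",
        family_name="Example",
        license_url="https://spdx.org/licenses/MIT",
        repository_url="https://example.org/my-analysis",
    )
    print(json.dumps(record, indent=2))
```

Saving the printed output as codemeta.json in the top level of the repository gives a starting point; a fuller record would typically add fields such as a description, keywords, and an identifier like a DOI.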

 

Screenshot of a codemeta.json file taken from the CodeMeta GitHub. A link to the file is in the caption below.

 

codemeta.json example file from the CodeMeta GitHub

 


Contributing and Code of Conduct Files

Anyone can use software on GitHub, GitLab, or Bitbucket according to the terms of its license. If you anticipate that your software will be used heavily, a contributing file is useful for defining the kinds of contributions you welcome; how best to submit suggestions, bug reports, and pull requests; how to ask for support in using the software; and how best to get in touch with you. These instructions are typically included in a file called CONTRIBUTING.md.

You can also include a code of conduct file, called CODE_OF_CONDUCT.md, to set the terms of how people can communicate with you and with each other on the project's pages.
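
Before sharing a repository, it can help to confirm that the files described in this guide are actually present. The following Python sketch is one way to do that; the function missing_files and the list of accepted file names are assumptions made for illustration, and the names are compared case-insensitively because projects capitalize them differently.

```python
from pathlib import Path

# Accepted names for each recommended file (an illustrative, not exhaustive, list).
RECOMMENDED = {
    "README": ["README", "README.md", "README.txt", "README.rst"],
    "citation file": ["CITATION.cff", "codemeta.json"],
    "contributing file": ["CONTRIBUTING.md"],
    "code of conduct": ["CODE_OF_CONDUCT.md"],
}


def missing_files(repo_dir):
    """Return labels of recommended files not found at the top level of repo_dir."""
    root = Path(repo_dir)
    # Compare lowercase names so README.MD and readme.md both count.
    present = {p.name.lower() for p in root.iterdir() if p.is_file()}
    return [
        label
        for label, names in RECOMMENDED.items()
        if not any(name.lower() in present for name in names)
    ]


if __name__ == "__main__":
    # Check the current directory; pass another path to check a different project.
    missing = missing_files(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All recommended files are present.")
```

The check looks only at the top level of the repository, so a contributing file kept in a subfolder would be reported as missing.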
 

Screenshot of the beginning of the contributing file from The Carpentries GitHub repo. A link to this file is in the caption below this image.

CONTRIBUTING.md from The Carpentries.

 
