Ideally, file types for a project should be standard, non-proprietary, and open. If that is not possible, choose file formats with sustainability and long-term use in mind. Try opening a Windows 95 Word document on a modern computer and you'll understand why (hint: you will get only Wingdings)!
Much of the software you will have to use relies on proprietary file formats, which tend to become unreadable as new versions are released or as tools lose relevance. Where possible, export data files to stable formats for long-term access to your data, or convert proprietary files into equivalent standardized files that can represent the same data (like going from .xlsx to .csv).
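As a rough illustration of the "export, then standardize" step, the sketch below rewrites a character-delimited export (for example, a tab-delimited file saved from a spreadsheet or statistics package) as a comma-separated UTF-8 file, using only Python's standard library. The function name delimited_to_csv and the file names are hypothetical. Reading .xlsx directly would need a third-party library and is not shown here.

```python
import csv
from pathlib import Path


def delimited_to_csv(src: Path, dst: Path, delimiter: str = "\t",
                     src_encoding: str = "utf-8") -> int:
    """Rewrite a character-delimited export as a UTF-8 CSV; return the row count."""
    rows_written = 0
    # newline="" lets the csv module manage line endings itself.
    with src.open("r", encoding=src_encoding, newline="") as fin, \
            dst.open("w", encoding="utf-8", newline="") as fout:
        reader = csv.reader(fin, delimiter=delimiter)
        writer = csv.writer(fout)  # comma-delimited, quoting only where needed
        for row in reader:
            writer.writerow(row)
            rows_written += 1
    return rows_written
```

A call such as delimited_to_csv(Path("export.tab"), Path("export.csv")) would leave the original export untouched and write the CSV copy next to it. Fields that contain commas are quoted in the output, so the columns survive the change of delimiter.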
A proprietary format can refer to:
An open format is:
Text
XML (.xml)
HTML (.htm)
OpenDocument Format (e.g. OpenDocument Text, .odt)
Plain text (.txt)
Markdown and other human-readable markup languages deploying plain-text editing
Tabular
Character-delimited files such as Comma-Separated Values (.csv) or Tab-Delimited (.tab)
XML
Plain text (.txt)
Media
Uncompressed TIFF (.tif)
Motion JPEG 2000 (.mj2)
MPEG-4 (.mp4)
Free Lossless Audio Codec (.flac)
Geospatial
ESRI Shapefiles and supporting files (.shp, .shx, .dbf, .prj, .sbx, .sbn)
KML (.kml)
GML (.gml)
GeoTIFF (.tif, .tfw)
Text
PDF/A
Statistical
SPSS portable format (.por)
R file formats, i.e. script files (.R), data files (.Rda, .RData), or R Markdown files (.Rmd)
Stata file formats, i.e. do-files (.do) and data files (.dta)
SAS file formats (.sas, .xpt, etc.)
Media
JPEG (.jpeg, .jpg)
MP3 (.mp3)
Photoshop files (.psd)
Geospatial
ESRI geodatabase formats (.gdb, .mdb)
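One way to put the two lists above to work is to scan a project folder and see which list each file's extension falls under. The sketch below is a minimal version of that idea, not an official tool. The group labels "first list" and "second list" are assumptions, since the lists carry no headings here, and the names classify and audit are hypothetical. PDF/A and Markdown are left out because the lists give no extension for them, and a .pdf extension alone cannot confirm PDF/A.

```python
from pathlib import Path

# Extensions copied from the two lists above (lowercased for matching).
FIRST_LIST = {
    ".xml", ".htm", ".odt", ".txt", ".csv", ".tab",
    ".tif", ".mj2", ".mp4", ".flac",
    ".shp", ".shx", ".dbf", ".prj", ".sbx", ".sbn", ".kml", ".gml", ".tfw",
}
SECOND_LIST = {
    ".por", ".r", ".rda", ".rdata", ".rmd", ".do", ".dta", ".sas", ".xpt",
    ".jpeg", ".jpg", ".mp3", ".psd", ".gdb", ".mdb",
}


def classify(path: Path) -> str:
    """Return which list (if any) the file's extension appears in."""
    ext = path.suffix.lower()
    if ext in FIRST_LIST:
        return "first list"
    if ext in SECOND_LIST:
        return "second list"
    return "unlisted"


def audit(root: Path) -> dict[str, list[str]]:
    """Group every file under root by the list its extension belongs to."""
    report: dict[str, list[str]] = {"first list": [], "second list": [], "unlisted": []}
    for p in sorted(root.rglob("*")):
        if p.is_file():
            report[classify(p)].append(p.relative_to(root).as_posix())
    return report
```

Running audit(Path("my_project")) on a hypothetical project folder returns a dictionary of relative paths per group, which gives a quick view of which files are candidates for conversion before archiving.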
Where the file format allows it, text should be encoded in Unicode (UTF-8 or UTF-16) or in the older ASCII encoding, which Unicode incorporates.
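The encoding advice can be checked and applied with a few lines of standard-library Python. The sketch below tests whether a file decodes as UTF-8 and, if not, rewrites it as UTF-8 from a known legacy encoding. The function names is_utf8 and transcode_to_utf8 are hypothetical, and the source encoding must be supplied by the user because it cannot be detected reliably from the bytes alone.

```python
from pathlib import Path


def is_utf8(path: Path) -> bool:
    """Return True if the file's bytes decode cleanly as UTF-8."""
    try:
        path.read_bytes().decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def transcode_to_utf8(src: Path, dst: Path, src_encoding: str) -> None:
    """Decode src using src_encoding and write the same text to dst as UTF-8."""
    # Working on bytes avoids any platform-specific newline translation.
    text = src.read_bytes().decode(src_encoding)
    dst.write_bytes(text.encode("utf-8"))
```

Because ASCII is a subset of UTF-8, a pure ASCII file already passes is_utf8 and needs no conversion. A file saved as Latin-1 with accented characters would typically fail the check and could be converted with transcode_to_utf8(src, dst, "latin-1").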