Downloading Catalist Data

The Catalist one percent data files are large: some are over 3 GB even when zipped, with almost 4 million rows and over 200 variables per row. Here are a few things to keep in mind when downloading the data.

  • The files are archived as zip packages. Some Mac users have had difficulty decompressing them: the .zip file is extracted into a .cpgz archive, which in turn cannot be opened. Workarounds include unzipping the files at the command line or using a third-party compression tool such as The Unarchiver. Users who unzip the files on Windows have had fewer problems.
  • Because of their size, the data files need to be opened with a statistical analysis package such as Stata or SPSS. Simply opening them in Excel won't work, since an Excel worksheet holds at most 1,048,576 rows. Data Services supports these packages and can help you if you get stuck. You may also contact High Performance Computing for advanced processing needs.
  • The larger files can take up to 30 minutes to open in one of these programs, depending on the performance of your computer.
  • The delimiter for the one percent sample files is the tab.
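
Before committing to a long load in Stata or SPSS, it can help to check the column names and a few rows. The following is a minimal Python sketch, not an official tool. The function name preview_tab_delimited is made up for illustration, and the sketch assumes that each archive holds a single tab-delimited text file, that the first row contains the variable names, that the text is UTF-8, and that fields are not quoted. It reads directly from the zip archive, so the file never has to be fully decompressed or loaded into memory.

```python
import csv
import io
import itertools
import zipfile


def preview_tab_delimited(zip_path, n_rows=5):
    """Return (header, rows): the first line and the next n_rows lines
    of the first file inside the zip archive at zip_path."""
    with zipfile.ZipFile(zip_path) as archive:
        # Assumption: the archive contains exactly one data file.
        member = archive.namelist()[0]
        with archive.open(member) as raw:
            # Assumption: UTF-8 text; undecodable bytes are replaced
            # rather than raising an error.
            text = io.TextIOWrapper(
                raw, encoding="utf-8", errors="replace", newline=""
            )
            # Tab is the delimiter; QUOTE_NONE treats quote characters
            # as ordinary text (assumes fields are not quoted).
            reader = csv.reader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            header = next(reader)
            # islice stops after n_rows, so the rest of the file is not read.
            rows = list(itertools.islice(reader, n_rows))
    return header, rows
```

Calling preview_tab_delimited with the path to a downloaded archive returns the list of variable names and the first few records as lists of strings. Because only the start of the file is read, the call should return quickly even for the largest files.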