
Data Services Class Descriptions

Information, materials, and schedules for all currently offered Data Services classes.
This intermediate-to-advanced class offers approaches to common data wrangling needs in research. It focuses on obtaining data, often via a web interface and especially an API, and loading it into a suitable data "container" for analysis; parsing data retrieved via an API and turning it into a useful object for manipulation and analysis; and performing basic summary counts of records in a dataset and working up quick visualizations.
Software: Computer workstations with Anaconda Python/Jupyter Notebook are available for in-person tutorials in Bobst 617. For remote tutorials, some patrons treat the session as a demonstration of the software, while others prefer a hands-on approach and want to interact with the software during the tutorial. If you prefer the latter, we recommend consulting our supported software page before the tutorial for information on accessing the software.
Duration: 120 min

Room description:

Some tutorials are held remotely and require NYU sign-on to access, while others are held in person with no remote component. Please note the correct modality and location of the tutorial when registering.

Prerequisites:
  • Familiarity with Pandas, NumPy, and core Python object types (lists, dictionaries, strings, numbers)
  • Familiarity with common data storage file types such as JSON and CSV
  • Comfort with using Jupyter Notebooks for writing code
  • Understanding of (or comfort with learning about) the principles of writing code to make HTTP (web) requests
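
For a sense of what "writing code to make HTTP requests" involves, here is a minimal sketch using only the Python standard library. It builds a GET request with query parameters and a bearer token but does not send it. The URL, parameter names, and token are hypothetical placeholders, not taken from the class materials, and real APIs may expect a different authorization scheme.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_request(base_url, params, token=None):
    """Build (but do not send) a GET request with query parameters.

    If a token is given, it is attached as a bearer token in the
    Authorization header; this scheme is an assumption for illustration.
    """
    url = f"{base_url}?{urlencode(params)}"
    headers = {"Accept": "application/json"}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    return Request(url, headers=headers)


# Hypothetical endpoint and token, for illustration only.
req = build_request(
    "https://api.example.org/v1/records",
    {"q": "libraries", "limit": 10},
    token="MY_TOKEN",
)
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send the request; the sketch stops short of that so it runs without network access.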
Skills Taught / Learning Outcomes:
  • Pulling data from website front ends, unrestricted web APIs, and web APIs requiring token authorization
  • Parsing API responses in JSON or HTML format to transform them into usable Python data objects
  • Performing quick summary statistics on data
  • Producing quick in-line visualizations in Jupyter Notebooks
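
The parsing and summary steps listed above might look something like the sketch below. It is an illustration only, not an excerpt from the class materials: the sample response, its `results` key, and the field names are invented, and it uses the standard library so it runs anywhere. The same list of records could also be loaded into a Pandas DataFrame.

```python
import json
from collections import Counter

# Invented stand-in for the text body of an API response.
SAMPLE_RESPONSE = """
{"results": [
  {"id": 1, "type": "book", "year": 2019},
  {"id": 2, "type": "article", "year": 2020},
  {"id": 3, "type": "book", "year": 2020}
]}
"""


def parse_records(body):
    """Decode a JSON response body and return its list of records."""
    payload = json.loads(body)
    # The "results" key is an assumption; real APIs name this differently.
    return payload.get("results", [])


def count_by(records, field):
    """Count how many records share each value of the given field."""
    return Counter(record.get(field) for record in records)


records = parse_records(SAMPLE_RESPONSE)
type_counts = count_by(records, "type")

# A quick text-only bar chart of the counts.
for value, n in type_counts.most_common():
    print(f"{value:<10}{'#' * n} ({n})")
```

In a Jupyter Notebook, the text bars would typically be replaced by an in-line plot, but the flow is the same: decode the response, pull out the records, count, and display.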
Class Materials: https://nyu-dataservices.github.io/DataHarvesting-Python/
Related Classes:

Introduction to Python

Data Cleaning Using OpenRefine

Data Visualization with Tableau

Introduction to Jupyter Notebooks

Introduction to Research Data Management

 

Additional Training Materials:

Data analysis with Python and Pandas [video]

Python API tutorial - An Introduction to using APIs

Feedback: bit.ly/feedbackds

 
