
Data Services Class Descriptions

Information, materials, and schedules for all currently offered Data Services classes
This session is an intermediate-level class that examines ways to perform data cleaning, transformation, and management using Python. We will look at efficient ways to load data and parse it into a container for ease of use in Python, to store it in helpful formats, and to perform some basic cleaning and transformations typical for mixed string-and-numeric formats. Finally, we'll try putting it all together using a dataset from the NYC Open Data portal.
Software: Python, Jupyter Notebooks
Duration: 120 min

Room description:

During the Fall 2021 semester, some tutorials are held remotely and require NYU sign-on to access, while others are held in person, without a remote component. Please note the correct modality and location of the tutorial when registering.

Prerequisites:
  • Ability to set and understand the object type of a variable
  • Familiarity with foundational object types (lists, strings, numbers, dictionaries) in Python
  • Familiarity with common data storage file types such as JSON and CSV
  • Comfort with, or willingness to learn more about, dataframe and array objects in Python
  • Comfort with using Jupyter Notebooks for writing code
Skills Taught / Learning Outcomes:
  • Transforming common formats for distributing data (CSV, JSON) into arrays and dataframes for cleaning and analysis
  • Building simple but robust environments in SQLite using Python’s sqlite3 to store and query larger datasets
  • Syntactic cleaning and refactoring of data to enable accurate analysis
  • Parsing incomplete or ill-formed datasets from open sources for robust research use
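As a rough illustration of how these outcomes fit together, the sketch below parses small CSV and JSON samples, cleans mixed string-and-numeric values, and stores the result in an in-memory SQLite database with sqlite3. It is not taken from the class materials: the sample data, the column names, the `service_calls` table, and the helpers `to_number` and `clean_row` are all hypothetical, and the example uses only the standard library rather than dataframes.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample data standing in for a downloaded CSV and JSON file.
CSV_TEXT = """borough,complaints,avg_response_hours
Brooklyn,"1,204",3.5
Queens,N/A,4.25
 Manhattan ,987,
"""
JSON_TEXT = '[{"borough": "Bronx", "complaints": "640", "avg_response_hours": "5"}]'

# Strings treated as missing values (an assumption for this sample).
MISSING = {"", "N/A", "NA", "NULL"}


def to_number(raw):
    """Convert a messy numeric string to int or float, or None if missing."""
    if raw is None:
        return None
    text = str(raw).strip().replace(",", "")  # drop thousands separators
    if text.upper() in MISSING:
        return None
    try:
        return int(text)
    except ValueError:
        try:
            return float(text)
        except ValueError:
            return None  # unparseable values are treated as missing


def clean_row(row):
    """Normalize one record: trim the text field, coerce the numeric fields."""
    return {
        "borough": row["borough"].strip(),
        "complaints": to_number(row.get("complaints")),
        "avg_response_hours": to_number(row.get("avg_response_hours")),
    }


# Parse both formats into one list of cleaned dictionaries.
rows = [clean_row(r) for r in csv.DictReader(io.StringIO(CSV_TEXT))]
rows += [clean_row(r) for r in json.loads(JSON_TEXT)]

# Store the cleaned rows in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE service_calls ("
    "borough TEXT, complaints INTEGER, avg_response_hours REAL)"
)
conn.executemany(
    "INSERT INTO service_calls "
    "VALUES (:borough, :complaints, :avg_response_hours)",
    rows,
)
conn.commit()

# Query the boroughs with a known complaint count, largest first.
busiest = conn.execute(
    "SELECT borough, complaints FROM service_calls "
    "WHERE complaints IS NOT NULL ORDER BY complaints DESC"
).fetchall()
print(busiest)
```

Missing or unparseable values become `None`, which sqlite3 stores as `NULL`, so the query can filter them out explicitly instead of silently treating them as zero.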
Class Materials:
Related Classes:

Data Visualization with Tableau

Data Cleaning Using OpenRefine

Introduction to Jupyter Notebooks

Introduction to Python

Additional Training Materials:

Python for Data Science Essential Training via LinkedIn Learning (NYU NetID required)

Working with SQLite Databases using Python and Pandas

Feedback: bit.ly/feedbackds

 

Upcoming sessions for this tutorial