DATA readings
Feature Engineering walks through various data preparation methods as well as PCA.
The following excerpts are from a resource on data wrangling in Python. As you read, keep an eye out for the methods used (some of which you’ve likely seen already) and for general ideas about how to begin working with data; it’s okay if you don’t understand every single example. We’ll cover more specific applications in class.
Working with a Pandas DataFrame describes how to look at and access features and entries inside a pandas DataFrame.
Reading in data via read_csv walks through different ways to load data. We’ll mainly pull CSVs from a URL (often from a Git repo), so skim this one.
Data exploring walks through some of the basics of EDA (some of this may look familiar from code in previous PAs).
Data summarization walks through how to do some summarization and transformations common in data cleaning.
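As a rough preview of what the four pandas readings above cover, here is a minimal sketch that loads a CSV, accesses a column and a row, and runs a few exploration and summarization steps. The dataset and its column names (name, species, mass_g) are made up for illustration and do not come from the readings. The CSV is read from an in-memory string so the example runs offline; read_csv also accepts a URL string in the same position.

```python
import io

import pandas as pd

# Hypothetical data standing in for a CSV file hosted at a URL.
csv_text = """name,species,mass_g
a,adelie,3700
b,adelie,3800
c,gentoo,5000
d,gentoo,
"""

# read_csv accepts a file path, a URL string, or a file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Accessing features and entries: one column by name, one row by position.
masses = df["mass_g"]
first_row = df.iloc[0]

# Basic exploration: dimensions and missing values per column.
shape = df.shape
missing = df.isna().sum()

# Summarization: mean mass per species (missing values are skipped).
mean_mass = df.groupby("species")["mass_g"].mean()

print(shape)
print(missing)
print(mean_mass)
```

To try this on a real file, replace io.StringIO(csv_text) with the URL of a raw CSV.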