Using read_csv()
If you use Anaconda, you need to make sure the CSV file is in the same folder as your IPYNB file.
If you use Google Colab, you need to upload the file to session storage. To do that, click on the “Files” tab on the left, followed by the “Upload to session storage” icon on top. You may need to wait for the session to load before the icon appears. Navigate to where the CSV file is stored on your local drive and upload it to Google Colab. You need to do this every time you get disconnected from Google Colab.
To read the pandasDemo.csv file, we use the read_csv() function in pandas. This function accepts the path of our file as an argument and reads the file into a DataFrame:
classData = pd.read_csv('pandasDemo.csv')
The statement above reads pandasDemo.csv into a DataFrame called classData. We’ll be working with this DataFrame for the rest of the chapter.
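If you do not have pandasDemo.csv at hand, the following minimal sketch shows the same read_csv() call on a small stand-in file. The file name sample.csv and its columns (Name, TA, Score) are made up for illustration and are not the contents of pandasDemo.csv.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in data; the real pandasDemo.csv may have different columns.
csv_text = "Name,TA,Score\nAmy,Ben,81.5\nCal,Dee,67.0\nEli,Ben,90.0\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write the stand-in CSV to a temporary folder.
    csv_path = os.path.join(tmp_dir, "sample.csv")
    with open(csv_path, "w") as f:
        f.write(csv_text)

    # read_csv() takes the path of the file and returns a DataFrame.
    sample = pd.read_csv(csv_path)

print(type(sample).__name__)   # DataFrame
print(sample.shape)            # (3, 3)
print(list(sample.columns))    # ['Name', 'TA', 'Score']
```

The first line of the file is used as the column labels, and each remaining line becomes one row.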
3.6 Exploring data in a DataFrame
When we first load data into a DataFrame, we typically want to have a quick look at the data. This can be done using the head() method, which shows the first five rows of a DataFrame. If we want more than five rows, we pass an integer to the method. For instance, to get the first six rows in classData, we write:
classData.head(6)
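To see the effect of the integer argument without relying on pandasDemo.csv, here is a small sketch on a made-up DataFrame with eight rows. The name df and its columns are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data: eight rows, so the default and the explicit count differ.
df = pd.DataFrame({"student": list("ABCDEFGH"), "score": range(50, 58)})

print(len(df.head()))    # 5  (default number of rows)
print(len(df.head(6)))   # 6  (explicit number of rows)
print(len(df))           # 8  (head() does not modify df)
```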
We can also generate some descriptive statistics for the numerical columns in our dataset using the describe() method:
classData.describe()
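As a rough illustration of what describe() returns, the sketch below uses a made-up DataFrame with one text column and one numerical column. The names df, name and score are assumptions, not columns from pandasDemo.csv.

```python
import pandas as pd

# Hypothetical data: one text column and one numerical column.
df = pd.DataFrame({
    "name": ["Amy", "Ben", "Cal", "Dee"],
    "score": [60, 70, 80, 90],
})

summary = df.describe()

# By default, only the numerical column is summarized.
print(list(summary.columns))          # ['score']
# The row labels of the result are the names of the statistics.
print(list(summary.index))            # ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max']
print(summary.loc["mean", "score"])   # 75.0
```

The result is itself a DataFrame, so individual statistics can be selected from it like any other value.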
Changing labels of row(s) and column(s)
Next, we can change the row and column labels of our DataFrame using the rename() method. To do that, we pass the index and columns parameters to the method:
classData = classData.rename(columns={'TA':'Assistant'}, index={0:'Row Zero',1:'Row One'})
classData.head()
The code above changes the label of the TA column to ‘Assistant’, and the labels of rows 0 and 1 to ‘Row Zero’ and ‘Row One’, respectively.
By default, the rename() method (and most of the other methods in pandas) does not change the DataFrame directly. Instead, it returns a new DataFrame. When this happens, we say that the change is not in place (i.e., the original DataFrame is not modified).
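As a side illustration, separate from classData, the sketch below shows this behavior on a tiny made-up DataFrame. The names df and renamed are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical one-column DataFrame.
df = pd.DataFrame({"TA": ["Ben", "Dee"]})

# rename() returns a new DataFrame; df itself keeps its old label.
renamed = df.rename(columns={"TA": "Assistant"})
print(list(df.columns))        # ['TA']
print(list(renamed.columns))   # ['Assistant']

# Assigning the result back to the same name makes the change stick.
df = df.rename(columns={"TA": "Assistant"})
print(list(df.columns))        # ['Assistant']
```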
If we want classData to have the new labels, we need to assign the returned DataFrame back to classData, which is what the statement classData = classData.rename(...) does. If you run that statement followed by classData.head(), you’ll get the following output: