Pandas Tutorial for Beginners

Using read_csv()

If you use Anaconda, you need to make sure the CSV file is in the same folder as your IPYNB file.

If you use Google Colab, you need to upload the file to session storage. To do that, click on the “Files” tab on the left, followed by the “Upload to session storage” icon on top. You may need to wait for the session to load before the icon appears. Navigate to where the CSV file is stored on your local drive and upload it to Google Colab. You need to do this every time you get disconnected from Google Colab.
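Before reading the file, it can help to confirm that the notebook can actually see it. The following is a minimal sketch, not part of the original tutorial: the helper name csv_is_available is hypothetical, and it assumes the notebook's working directory is the folder where the CSV was saved or uploaded, which is usually the case in Jupyter and Colab.

```python
from pathlib import Path


def csv_is_available(filename="pandasDemo.csv"):
    """Return True if `filename` exists in the current working directory.

    Hypothetical helper for illustration only.
    """
    return (Path.cwd() / filename).is_file()


# Show where the notebook is looking for files, and whether the CSV is there.
print("Working directory:", Path.cwd())
print("pandasDemo.csv found:", csv_is_available())
```

If the second line prints False, move or upload the file again before calling read_csv().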

To read the pandasDemo.csv file, we use the read_csv() function in pandas. This function accepts the path of our file as an argument and reads the file into a DataFrame:

classData = pd.read_csv('pandasDemo.csv')

The statement above reads pandasDemo.csv into a DataFrame called classData. We’ll be working with this DataFrame for the rest of the chapter.
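If you do not have pandasDemo.csv at hand, the same call can be tried on in-memory text. This is only a sketch: the column names and values below are made up for illustration (only the TA column is mentioned in this tutorial), and io.StringIO stands in for a file on disk.

```python
import io

import pandas as pd

# Hypothetical stand-in for the contents of pandasDemo.csv.
csv_text = """Name,TA,Score
Alice,Tom,85
Ben,Tom,72
Cara,Sue,90
"""

# read_csv() accepts a file path or a file-like object, so StringIO lets
# this example run without a file on disk.
demo = pd.read_csv(io.StringIO(csv_text))

print(type(demo))  # a pandas DataFrame
print(demo.shape)  # (number of rows, number of columns)
```

The first line of the text is treated as the header row, so it becomes the column labels rather than a data row.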

3.6 Exploring data in a DataFrame

When we first load data into a DataFrame, we typically want to have a quick look at the data. This can be done using the head() method, which returns the first five rows of a DataFrame by default. If we want a different number of rows, we pass an integer to the method. For instance, to get the first six rows in classData, we write:

classData.head(6)

Figure: Displaying the first six rows of a DataFrame
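To see the default and the explicit row count side by side, here is a small sketch on a made-up ten-row DataFrame (the Student and Score columns are assumptions, not the real contents of classData):

```python
import pandas as pd

# Hypothetical ten-row DataFrame standing in for classData.
demo = pd.DataFrame({"Student": range(1, 11), "Score": range(50, 100, 5)})

first_five = demo.head()   # no argument: the default of five rows
first_six = demo.head(6)   # explicit row count

print(len(first_five), len(first_six))
```

In both cases head() returns a new DataFrame and leaves demo untouched.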

We can also generate some descriptive statistics for the numerical columns in our dataset using the describe() method:

classData.describe()

Figure: Using the describe() method
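The sketch below shows describe() on a small made-up DataFrame that mixes a text column and a numeric column (both column names are assumptions). With default arguments, only the numeric column appears in the summary.

```python
import pandas as pd

# Hypothetical data: one text column and one numeric column.
demo = pd.DataFrame({
    "Name": ["Alice", "Ben", "Cara", "Dan"],
    "Score": [80, 70, 90, 60],
})

# For a mixed DataFrame, describe() summarizes only the numeric columns
# by default.
summary = demo.describe()

print(summary)
print(summary.loc["mean", "Score"])  # look up a single statistic by label
```

The result is itself a DataFrame whose row labels are the statistic names (count, mean, std, min, the quartiles, and max), so individual values can be looked up with loc.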

Changing labels of row(s) and column(s)

Next, we can change the row and column labels of our DataFrame using the rename() method. To do that, we pass the index and columns parameters to the method:

classData = classData.rename(columns={'TA':'Assistant'}, index={0:'Row Zero',1:'Row One'})
classData.head()

The code above changes the label of the TA column to ‘Assistant’, and the labels of rows 0 and 1 to ‘Row Zero’ and ‘Row One’, respectively.

By default, the rename() method (and most of the other methods in pandas) does not change the DataFrame directly. Instead, it returns a new DataFrame. When this happens, we say that the change is not in place (i.e., the original DataFrame is not modified).

If we want classData to have the new labels, we need to assign the returned DataFrame back to classData, which we did in this example. If you run the code above, you’ll get the following output:

Figure: Changing the labels of a DataFrame
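To check that rename() really leaves the original untouched, the sketch below stores the result under a separate name instead of assigning it back. The two-row DataFrame is made up for illustration; only the TA column name comes from this tutorial.

```python
import pandas as pd

# Hypothetical two-row DataFrame standing in for classData.
demo = pd.DataFrame({"Name": ["Alice", "Ben"], "TA": ["Tom", "Sue"]})

# rename() returns a new DataFrame; demo itself keeps its old labels.
renamed = demo.rename(
    columns={"TA": "Assistant"},
    index={0: "Row Zero", 1: "Row One"},
)

print(list(demo.columns), list(demo.index))        # original labels
print(list(renamed.columns), list(renamed.index))  # new labels
```

Assigning the result back to the same variable, as done with classData above, is what makes the new labels stick.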
