Python is one of the world's most popular programming languages for dealing with large datasets. Any data scientist will tell you that most datasets contain missing data in one form or another, so knowing how to deal with missing data is an important skill.
In this tutorial, I will teach you how to deal with missing data in the pandas library for Python.
The Importance of Education
Before beginning this tutorial on how to deal with missing data in pandas, I want to speak for a moment on the importance of education.
Education - especially online education - is more important than ever. I am a living testimonial to this. I have stumbled into a wonderful career as a Python programmer and educator using information I learned entirely through the Internet. Today, I make a living teaching people how to code through my website.
This tutorial is designed to be an example of how you can further your education through the Internet.
How to Import Pandas
This tutorial will teach you how to deal with missing data in the pandas library for Python.
The name pandas is derived from "panel data", and the library is an open-source tool for dealing with tabular data. "Tabular data" just means two-dimensional data that is formatted into columns and rows. A Microsoft Excel spreadsheet is a textbook example of tabular data.
The first thing you'll need to do is import the pandas library. It is convention to import pandas under the alias `pd`. Here is the full command to do this:
import pandas as pd
Now that we have imported the pandas library, we are ready to proceed with our tutorial.
How to Create a DataFrame With Missing Data
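The next thing we need to do is create a pandas DataFrame with some missing data. The example uses NumPy's `np.nan` to mark the missing values, so NumPy needs to be imported as well:
import numpy as np  # provides np.array and np.nan
df = pd.DataFrame(np.array([[1, 5, 1], [2, np.nan, 2], [np.nan, np.nan, 3]]))
df.columns = ['A', 'B', 'C']  # label the three columns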
We now have a pandas DataFrame that contains missing data. We will learn how to deal with this missing data in the next section of this tutorial.
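Before changing anything, it can help to confirm where the gaps are. The snippet below is a minimal sketch, not part of the original walkthrough: it rebuilds the same DataFrame so that it runs on its own, then counts the missing values in each column with `isna`. The variable names `missing_mask` and `missing_per_column` are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrame so this snippet is self-contained
df = pd.DataFrame(
    np.array([[1, 5, 1], [2, np.nan, 2], [np.nan, np.nan, 3]]),
    columns=["A", "B", "C"],
)

# isna() returns a same-shaped DataFrame of booleans: True where a value is missing
missing_mask = df.isna()

# Summing the booleans column by column counts the missing values per column
missing_per_column = missing_mask.sum()
print(missing_per_column)
```

Column A has one missing value, column B has two, and column C has none.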
How to Deal With Missing Data in Pandas
The pandas library has two built-in methods that are both excellent for dealing with missing data: `fillna` and `dropna`.
Here is how you could use the `fillna` method to replace every missing data point with the fire emoji:
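A minimal sketch of that call, assuming the same DataFrame built earlier in this tutorial (it is rebuilt here so the snippet runs on its own). The name `filled` is illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrame so this snippet is self-contained
df = pd.DataFrame(
    np.array([[1, 5, 1], [2, np.nan, 2], [np.nan, np.nan, 3]]),
    columns=["A", "B", "C"],
)

# fillna returns a new DataFrame with every missing value replaced;
# the original df is left unchanged
filled = df.fillna("🔥")
print(filled)
```

Because the replacement is a string, the filled columns no longer hold purely numeric values. A numeric fill value such as `df.fillna(0)` keeps them numeric.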
Here is how you could use the `dropna` method to remove every row that contains missing data from the DataFrame:
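A minimal sketch of that call, again assuming the DataFrame built earlier in this tutorial (rebuilt here so the snippet runs on its own). The name `cleaned` is illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrame so this snippet is self-contained
df = pd.DataFrame(
    np.array([[1, 5, 1], [2, np.nan, 2], [np.nan, np.nan, 3]]),
    columns=["A", "B", "C"],
)

# dropna with no arguments removes every row that has at least one missing value
# and returns a new DataFrame; the original df is left unchanged
cleaned = df.dropna()
print(cleaned)
```

Only the first row survives here, since it is the only row without a missing value.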
As you can see, dealing with missing data within a pandas DataFrame is much easier than you might think!
In this tutorial, you learned how to deal with missing data in the pandas programming library for Python.
Here is a brief summary of what was discussed in this tutorial:
The importance of online education in 2020 (and beyond)
An introduction to the pandas library for Python
How to create a pandas DataFrame that contains missing data
How to either remove or replace the missing data from the pandas DataFrame
This was an excellent example of how online education tutorials can help you advance your skill set. As you proceed through your career as a Python developer, remember that it's possible to learn almost anything through the Internet.