Pandas — Reading Raw Data

Pandas is a powerful data organization and analysis toolbox in Python. Its functions and methods help you clean, organize, and present data far more conveniently than Python's built-in data structures such as lists and dictionaries.

I will not be able to cover every Pandas function and method. Instead, I will try to uncover the logic and methodology underneath the package, so that you don't need to remember everything but can still get things done efficiently.

First of all, I will talk about reading raw data from various sources using Pandas.

Pandas provides a family of pd.read_xxx functions, such as pd.read_csv, pd.read_excel, and pd.read_json, each of which reads a different kind of data file into Pandas.
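If you want to see which readers your own installation offers, one quick way is to list every top-level name in pandas that starts with "read_". This is only a small sketch; the exact list depends on your pandas version.

```python
import pandas as pd

# Collect every top-level pandas name that starts with "read_".
# The result varies between pandas versions.
readers = sorted(name for name in dir(pd) if name.startswith("read_"))
print(readers)
```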

Let's take a closer look at read_csv, which reads CSV files, one of the most common raw data file types.

One trick for getting a first look at a method or function is to press the TAB key while typing the call in an interactive environment such as Jupyter Notebook. It will show you the available options.
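Outside an interactive notebook, you can get a similar overview from plain Python. The sketch below uses the standard inspect module to list the option names that read_csv accepts; calling help(pd.read_csv) would print the full documentation instead.

```python
import inspect
import pandas as pd

# Read the signature of read_csv and collect its parameter names.
options = list(inspect.signature(pd.read_csv).parameters)

# Show how many options there are and the first few of them.
print(len(options), "options, for example:", options[:5])
```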


Usually, you can just pass the file path to read_csv to read the file.
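Here is a minimal sketch of that simple case. To keep it self-contained, it first writes a tiny CSV file to a temporary folder; the file name "people.csv" and its contents are made up for illustration.

```python
import os
import tempfile
import pandas as pd

# Write a small, well-formed CSV so the example can run anywhere.
# "people.csv" and its rows are hypothetical sample data.
tmp_dir = tempfile.mkdtemp()
file_path = os.path.join(tmp_dir, "people.csv")
with open(file_path, "w") as f:
    f.write("name,age\nAlice,30\nBob,25\n")

# For a clean file, the path is the only argument you need.
df = pd.read_csv(file_path)
print(df)
```

With your own data, you would skip the file-writing step and pass the path of your existing file.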

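But sometimes you get a broken data frame, for example when the file uses an unusual separator or has extra lines above the header. In those cases, the reading options can help you get a better result.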
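To make this concrete, here is a small sketch with made-up data: the text has a note line above the header and uses semicolons instead of commas. Reading it with the defaults gives a single jumbled column, while the sep and skiprows options of read_csv recover the intended table. Which options you need depends entirely on how your own file is laid out.

```python
import io
import pandas as pd

# Hypothetical raw text: one note line on top, then semicolon-separated data.
raw = "exported by some tool\nname;score\nalice;90\nbob;85\n"

# Default settings: the note line becomes the header and every row
# lands in a single column.
broken = pd.read_csv(io.StringIO(raw))
print(broken.shape)

# Skip the note line and tell pandas that fields are separated by ";".
fixed = pd.read_csv(io.StringIO(raw), sep=";", skiprows=1)
print(fixed)
```

io.StringIO is used here only so the example runs without a file on disk; a file path works the same way.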