JitCoder

Pandas Tutorial for Beginners (Step-by-Step Guide)

Introduction

Pandas is one of the most powerful Python libraries used for data analysis and data manipulation. Whether you are working in data science, machine learning, or backend development, learning Pandas is essential.

In this tutorial, you will learn Pandas from scratch, including installation, DataFrames, operations, and real examples.


What is Pandas?

Pandas is an open-source Python library used for handling structured data efficiently. It provides easy-to-use data structures and tools for data analysis.

Key Features

  • Fast and efficient data handling
  • Easy data cleaning and transformation
  • Reads and writes CSV, Excel, and SQL data
  • Powerful grouping and filtering

Installing Pandas

You can install Pandas using pip:

pip install pandas

Import Pandas in Python:

import pandas as pd

Pandas Data Structures

Pandas mainly provides two data structures:

1. Series

A Series is a one-dimensional labeled array: each value is paired with an index label (0, 1, 2, ... by default).

data = [10, 20, 30]
series = pd.Series(data)
print(series)
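As a small sketch of what the index labels are for, the example below builds a Series with custom labels and looks values up by label and by position. The labels "a", "b", and "c" are made up for illustration.

```python
import pandas as pd

# Hypothetical labels for illustration; any hashable values can serve as the index.
scores = pd.Series([10, 20, 30], index=["a", "b", "c"])

second = scores["b"]      # label-based lookup
first = scores.iloc[0]    # position-based lookup
total = scores.sum()      # reduction over all values
print(second, first, total)
```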

2. DataFrame

A DataFrame is a two-dimensional table (rows and columns).

data = {
    "Name": ["A", "B", "C"],
    "Marks": [85, 90, 88]
}
df = pd.DataFrame(data)
print(df)
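Once a DataFrame exists, it often helps to check its size and column types before doing anything else. A minimal sketch using the same sample data:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["A", "B", "C"], "Marks": [85, 90, 88]})

n_rows, n_cols = df.shape          # (number of rows, number of columns)
column_names = list(df.columns)    # column labels as a plain list
marks_dtype = df["Marks"].dtype    # integer dtype inferred from the values
print(n_rows, n_cols, column_names, marks_dtype)
```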

Reading Data in Pandas

Pandas can read data from different sources.

Read CSV File

df = pd.read_csv("data.csv")
print(df)
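If you do not have a CSV file at hand, you can try read_csv on an in-memory string. The sketch below uses io.StringIO as a stand-in for a file; the CSV contents are made up, and usecols is shown only as one example of the optional parameters.

```python
import io
import pandas as pd

# In-memory stand-in for a file such as "data.csv" (contents are invented).
csv_text = "Name,Marks,City\nA,85,X\nB,90,Y\nC,88,Z\n"

# usecols limits which columns are loaded.
df = pd.read_csv(io.StringIO(csv_text), usecols=["Name", "Marks"])
print(df)
```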

Read Excel File

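df = pd.read_excel("data.xlsx")

Reading .xlsx files requires the openpyxl package, which is installed separately (pip install openpyxl).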

Viewing Data

First 5 Rows

df.head()

Last 5 Rows

df.tail()

Data Information

df.info()
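Both head() and tail() accept a row count, and describe() gives quick summary statistics for numeric columns. A short sketch on invented data:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": list("ABCDEFG"),
    "Marks": [80, 90, 85, 95, 70, 60, 75],
})

top3 = df.head(3)        # first 3 rows instead of the default 5
bottom2 = df.tail(2)     # last 2 rows
summary = df.describe()  # count, mean, std, min, quartiles, max
print(summary)
```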

Selecting Data

Select Column

df["Name"]

Select Multiple Columns

df[["Name", "Marks"]]

Select Rows

df.iloc[0]

This returns the first row, selected by its position.
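Rows can be selected by position with iloc or by label with loc. The sketch below contrasts the two; the row labels "s1" to "s3" are hypothetical and only there to make the difference visible.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["A", "B", "C"], "Marks": [85, 90, 88]},
    index=["s1", "s2", "s3"],  # hypothetical row labels
)

by_position = df.iloc[0]         # first row, whatever its label is
by_label = df.loc["s2"]          # row whose label is "s2"
first_two = df.iloc[0:2]         # position slice, end is exclusive
label_range = df.loc["s1":"s2"]  # label slice, end is inclusive
print(first_two)
```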

Filtering Data

df[df["Marks"] > 85]

This returns rows where marks are greater than 85.
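Filters can also be combined. A minimal sketch, assuming the same kind of Name and Marks data: wrap each condition in parentheses and join them with & (and) or | (or), since Python's plain and/or do not work on whole columns.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["A", "B", "C", "D"], "Marks": [80, 90, 85, 95]})

# Each condition needs its own parentheses.
mid_range = df[(df["Marks"] > 80) & (df["Marks"] < 95)]
extremes = df[(df["Marks"] <= 80) | (df["Marks"] >= 95)]

# isin() keeps rows whose value appears in the given list.
selected = df[df["Name"].isin(["A", "C"])]
print(mid_range)
```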


Adding a New Column

df["Grade"] = ["A", "A+", "A"]
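A new column can also be computed from existing ones instead of being typed in by hand. In the sketch below, the column names Distinction and OutOf50 and the 90-mark threshold are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["A", "B", "C"], "Marks": [85, 90, 88]})

# Hypothetical rule: 90 and above counts as a distinction.
df["Distinction"] = df["Marks"] >= 90

# Arithmetic applies to the whole column at once.
df["OutOf50"] = df["Marks"] / 2
print(df)
```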

Updating Data

df.loc[0, "Marks"] = 95
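The same loc syntax can update every row that matches a condition. A small sketch, using a made-up rule that raises any mark below 88 up to 88:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["A", "B", "C"], "Marks": [85, 90, 88]})

# Hypothetical rule: the boolean mask picks the rows, "Marks" picks the column.
df.loc[df["Marks"] < 88, "Marks"] = 88
print(df)
```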

Deleting a Column

df.drop("Grade", axis=1, inplace=True)
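As an alternative to inplace=True, drop() can return a new DataFrame and leave the original untouched, which makes it harder to lose data by accident. A sketch using the columns= keyword:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Marks": [85, 90, 88],
    "Grade": ["A", "A+", "A"],
})

# columns= names the columns to remove, so no axis argument is needed.
without_grade = df.drop(columns=["Grade"])
print(without_grade)
```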

Handling Missing Values

Check Missing Values

df.isnull()

Fill Missing Values

df.fillna(0, inplace=True)
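Filling with 0 is not always the right choice. The sketch below, on invented data with one missing mark, counts missing values per column and then shows two other options: filling with the column mean, or dropping incomplete rows.

```python
import pandas as pd

# None in a numeric column becomes NaN (missing).
df = pd.DataFrame({"Name": ["A", "B", "C"], "Marks": [85, None, 88]})

missing_per_column = df.isnull().sum()  # missing values in each column

# Fill only the Marks column, using the mean of the known values.
filled = df.fillna({"Marks": df["Marks"].mean()})

# Or remove rows that contain any missing value.
dropped = df.dropna()
print(missing_per_column)
```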

Sorting Data

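df.sort_values("Marks", ascending=False)

This returns a new DataFrame sorted from highest to lowest marks. The original df is unchanged, so assign the result to a variable if you want to keep it.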
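Sorting by more than one column works the same way. A minimal sketch on made-up data: marks sorted high to low, with ties broken alphabetically by name.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["A", "B", "C", "D"], "Marks": [90, 85, 90, 80]})

# One ascending flag per sort column; reset_index renumbers the rows.
ranked = (
    df.sort_values(["Marks", "Name"], ascending=[False, True])
      .reset_index(drop=True)
)
print(ranked)
```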

Grouping Data

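df.groupby("Grade")["Marks"].mean()

This returns the average marks for each grade and assumes df still has a Grade column. Select the numeric column before calling mean(), because recent Pandas versions raise an error when mean() is applied to a text column such as Name.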
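To compute several statistics per group at once, agg() accepts a list of function names. The sketch below uses invented grades to keep the example self-contained.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Marks": [80, 90, 85, 95],
    "Grade": ["B", "A", "B", "A"],  # hypothetical grades
})

# One row per grade, one column per statistic.
stats = df.groupby("Grade")["Marks"].agg(["mean", "max", "count"])
print(stats)
```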

Working with a Real Example

data = {
    "Name": ["A", "B", "C", "D"],
    "Marks": [80, 90, 85, 95]
}
df = pd.DataFrame(data)

# Filter students with marks > 85
result = df[df["Marks"] > 85]
print(result)

Exporting Data

Save to CSV

df.to_csv("output.csv", index=False)
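To see what index=False does without writing a file, you can export to an in-memory buffer and read it back. This is only a sketch; io.StringIO stands in for "output.csv".

```python
import io
import pandas as pd

df = pd.DataFrame({"Name": ["A", "B"], "Marks": [80, 90]})

buffer = io.StringIO()          # in-memory stand-in for a file
df.to_csv(buffer, index=False)  # index=False leaves out the row numbers
csv_text = buffer.getvalue()

# Reading the text back gives the same columns and values.
restored = pd.read_csv(io.StringIO(csv_text))
print(csv_text)
```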

Advantages of Pandas

  • Easy to learn and use
  • Handles large datasets, as long as they fit in memory
  • Integrates with NumPy and Matplotlib
  • Widely used in industry

Common Mistakes

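  • Not handling missing values before doing calculations
  • Mixing up label-based (loc) and position-based (iloc) indexing
  • Forgetting the axis argument (axis=0 for rows, axis=1 for columns) in operations such as drop
  • Not checking data types after loading data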
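The data type mistake is easy to reproduce. In the sketch below, the marks arrive as text (including a made-up "n/a" entry), so numeric operations would not behave as expected until the column is converted with pd.to_numeric.

```python
import pandas as pd

# Invented data: marks stored as strings, with one unparseable entry.
df = pd.DataFrame({"Name": ["A", "B", "C"], "Marks": ["85", "90", "n/a"]})

numeric_before = pd.api.types.is_numeric_dtype(df["Marks"])  # text, not numbers

# errors="coerce" turns values that cannot be parsed into NaN.
df["Marks"] = pd.to_numeric(df["Marks"], errors="coerce")

numeric_after = pd.api.types.is_numeric_dtype(df["Marks"])
missing_count = df["Marks"].isnull().sum()
print(df.dtypes)
```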

Tips to Learn Pandas Faster

  • Practice daily with datasets
  • Try real-world projects
  • Use Jupyter Notebook
  • Focus on DataFrame operations

Conclusion

Pandas is an essential tool for anyone working with data in Python. In this tutorial, you learned the basics of Pandas, including DataFrames, data selection, filtering, and data cleaning.

With regular practice, you can master Pandas and use it in data science, machine learning, and real-world applications.
