Python for AI (Part 4): Visualizing Data with Matplotlib



Introduction

This is Part 4 of your Python-for-AI journey! If you’ve got the hang of Pandas from Part 3, you’re all set to tackle Matplotlib, Python’s workhorse for data visualization. In AI, data visualization helps you discover patterns, identify outliers, and communicate your findings. Building on your Java skills and the Python background from the earlier parts, this tutorial gets hands-on with practical examples.

What is Matplotlib and Why Use It?

Matplotlib generates line graphs, bar charts, histograms, and more. In AI work it is a key tool for understanding datasets and model results. Typical uses:

  • Plot data distributions to check for outliers.
  • Visualize trends (e.g., temperature over time).
  • Show model performance (e.g., accuracy over iterations).

To get started, install Matplotlib:

pip install matplotlib

Then import it, along with Pandas and NumPy, which pair nicely with it:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

Basic Plotting: Line Plot

Let’s start with a simple line plot using the student data from Part 3:

# Data from Part 3
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'Dana'],
    'Math': [85, 92, 78, 95],
    'Science': [88, 85, 90, 92]
}
df = pd.DataFrame(data)

# Plot Math and Science scores as two lines
plt.plot(df['Name'], df['Math'], marker='o', label='Math')
plt.plot(df['Name'], df['Science'], marker='s', label='Science')
plt.title('Student Exam Scores')
plt.xlabel('Student')
plt.ylabel('Score')
plt.legend()
plt.grid(True)
plt.show()

What’s Happening:

  • plt.plot() creates a line with points (marker='o' for circles, 's' for squares).
  • title, xlabel, ylabel label the plot.
  • legend shows which line is which.
  • grid adds a background grid.

Output: A line graph comparing Math and Science scores across students, with markers and a legend. Line plots like this are great for tracking trends.
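The same plot can also be built with Matplotlib’s object-oriented interface, where you hold on to the figure and axes explicitly instead of relying on pyplot’s implicit "current plot". The sketch below is one way to write it, assuming the same student data; the variable names (fig, ax, math_line, science_line) are illustrative choices.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Same student data as in the example above
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Dana'],
    'Math': [85, 92, 78, 95],
    'Science': [88, 85, 90, 92],
})

# Create the figure and a single axes object explicitly
fig, ax = plt.subplots(figsize=(6, 4))

# ax.plot() returns a list of Line2D objects; unpack the single line
math_line, = ax.plot(df['Name'], df['Math'], marker='o', label='Math')
science_line, = ax.plot(df['Name'], df['Science'], marker='s', label='Science')

# Axes methods use set_* names instead of plt.title(), plt.xlabel(), ...
ax.set_title('Student Exam Scores')
ax.set_xlabel('Student')
ax.set_ylabel('Score')
ax.legend()
ax.grid(True)
```

Call plt.show() to display the figure, or fig.savefig('scores.png') to write it to a file (the filename is just a placeholder). This style becomes more useful once a figure contains several plots.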

Bar Chart: Comparing Categories

Bar charts excel at comparisons. Plot average scores:

# Add Average column
df['Average'] = (df['Math'] + df['Science']) / 2

# Bar chart
plt.bar(df['Name'], df['Average'], color=['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728'])
plt.title('Average Scores by Student')
plt.xlabel('Student')
plt.ylabel('Average Score')
plt.show()

What’s Happening:

  • plt.bar() makes bars, with custom colors for each student.
  • No lines, just bars—great for comparisons.

Output: A colorful bar chart showing each student’s average score. Each bar gets its own color, which makes quick visual comparison easy.
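If you want to compare both subjects per student rather than the average, a grouped bar chart is one option. The sketch below assumes the same student data and uses NumPy to compute bar positions; the bar width of 0.35 is an arbitrary choice for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Dana'],
    'Math': [85, 92, 78, 95],
    'Science': [88, 85, 90, 92],
})

x = np.arange(len(df))  # one numeric slot per student: 0, 1, 2, 3
width = 0.35            # width of each bar (arbitrary, for illustration)

fig, ax = plt.subplots()
# Shift each subject half a bar width left or right of the slot center
math_bars = ax.bar(x - width / 2, df['Math'], width, label='Math')
science_bars = ax.bar(x + width / 2, df['Science'], width, label='Science')

# Replace the numeric tick positions with student names
ax.set_xticks(x)
ax.set_xticklabels(df['Name'])
ax.set_title('Math and Science Scores by Student')
ax.set_xlabel('Student')
ax.set_ylabel('Score')
ax.legend()
```

Add plt.show() at the end to display it. The trick is that bars are drawn at numeric x positions, so the two groups are offset around each slot and the names are applied as tick labels afterwards.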

Scatter Plot: Exploring Relationships

Scatter plots reveal relationships between variables. Let’s plot Math vs. Science scores:

plt.scatter(df['Math'], df['Science'], color='purple', s=100, alpha=0.6)
plt.title('Math vs. Science Scores')
plt.xlabel('Math Score')
plt.ylabel('Science Score')
for name, x, y in zip(df['Name'], df['Math'], df['Science']):
    plt.annotate(name, (x, y), xytext=(5, 5), textcoords='offset points')
plt.show()

What’s Happening:

  • plt.scatter() plots points (s for size, alpha for transparency).
  • annotate labels each point with the student’s name.

Output: A scatter plot showing whether Math and Science scores correlate, with each point labeled by the student’s name. This is a handy way to spot correlations in AI data.
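A scatter plot gives a visual impression; to put a number on it, you can compute the Pearson correlation coefficient with Pandas. This is a small sketch using the same student data. With only four students the result says very little, so treat it as a demonstration of the call rather than a finding.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Dana'],
    'Math': [85, 92, 78, 95],
    'Science': [88, 85, 90, 92],
})

# Series.corr() defaults to the Pearson correlation coefficient,
# which ranges from -1 (inverse) through 0 (none) to +1 (direct)
r = df['Math'].corr(df['Science'])
print(f"Pearson correlation between Math and Science: {r:.2f}")
```

For this tiny sample the value comes out close to zero, which matches the scatter plot: the four points show no clear upward or downward trend.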

Subplots: Multiple Views

In AI, you often compare multiple aspects of the same data. Subplots let you place those views side by side in one figure:

# 1 row, 2 columns
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Plot 1: Math scores
ax1.plot(df['Name'], df['Math'], 'b-o')
ax1.set_title('Math Scores')
ax1.set_xlabel('Student')
ax1.set_ylabel('Score')

# Plot 2: Science scores
ax2.plot(df['Name'], df['Science'], 'g-s')
ax2.set_title('Science Scores')
ax2.set_xlabel('Student')
ax2.set_ylabel('Score')

# Adjust spacing
plt.tight_layout()
plt.show()

What’s Happening:

  • subplots() creates two side-by-side plots.
  • Each ax object is a separate plot with its own settings.

Output: Two line plots showing Math and Science scores side by side in one figure, which is handy for multi-angle analysis in AI.
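One thing to watch with side-by-side plots is that each axes picks its own y-range by default, which can make the two panels look more similar or more different than they are. A possible variation, sketched below with the same student data, passes sharey=True so both panels use one scale.

```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Dana'],
    'Math': [85, 92, 78, 95],
    'Science': [88, 85, 90, 92],
})

# sharey=True links the y-axes, so both panels show the same score range
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

ax1.plot(df['Name'], df['Math'], 'b-o')
ax1.set_title('Math Scores')
ax1.set_xlabel('Student')
ax1.set_ylabel('Score')  # the shared axis only needs one label

ax2.plot(df['Name'], df['Science'], 'g-s')
ax2.set_title('Science Scores')
ax2.set_xlabel('Student')

fig.suptitle('Scores on a shared y-axis')
fig.tight_layout()
```

Add plt.show() to display it. Whether to share axes is a judgment call: sharing makes magnitudes comparable, while independent axes show more detail within each panel.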

Time Series: Real-World AI Example

Imagine you’re analyzing daily temperatures for an AI weather model. A line plot over the days shows the trend:

# Weather data
weather = pd.DataFrame({
    'Day': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri'],
    'Temp': [22, 25, 19, 28, 24]
})
plt.plot(weather['Day'], weather['Temp'], 'r-^', label='Temperature')
plt.title('Daily Temperatures')
plt.xlabel('Day')
plt.ylabel('Temperature (°C)')
plt.legend()
plt.xticks(rotation=45) # Rotate x-axis labels
plt.show()

What’s Happening:

  • plt.plot() with 'r-^' draws a red line with triangle markers ('^').
  • plt.xticks(rotation=45) rotates the x-axis labels for readability.

Output: A plot showing temperature trends over the days, with rotated labels to keep the axis readable. This kind of plot suits time-based AI tasks well.
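Daily readings are often noisy, so a common companion to a time series plot is a rolling average drawn over the raw values. The sketch below assumes the same weather data; the 3-day window and the column name Temp_3day are illustrative choices.

```python
import matplotlib.pyplot as plt
import pandas as pd

weather = pd.DataFrame({
    'Day': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri'],
    'Temp': [22, 25, 19, 28, 24],
})

# Mean of each day and the two days before it.
# The first two rows are NaN because a full 3-day window is not available yet.
weather['Temp_3day'] = weather['Temp'].rolling(window=3).mean()

fig, ax = plt.subplots()
ax.plot(weather['Day'], weather['Temp'], 'r-^', label='Temperature')
ax.plot(weather['Day'], weather['Temp_3day'], 'k--', label='3-day rolling mean')
ax.set_title('Daily Temperatures with Rolling Mean')
ax.set_xlabel('Day')
ax.set_ylabel('Temperature (°C)')
ax.legend()
```

Add plt.show() to display it. Matplotlib skips the NaN values, so the dashed line starts on Wednesday.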

Histogram: Data Distribution

In AI, histograms help you understand data spread. Let’s plot all scores:

all_scores = pd.concat([df['Math'], df['Science']])
plt.hist(all_scores, bins=5, color='skyblue', edgecolor='black')
plt.title('Distribution of All Scores')
plt.xlabel('Score')
plt.ylabel('Frequency')
plt.show()

What’s Happening:

  • plt.hist() bins scores into 5 groups.
  • edgecolor outlines bars for clarity.

Output: A histogram showing how scores are distributed. On a realistically sized dataset, this is a quick first check of whether values look roughly normal, a common step in AI preprocessing.
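To see exactly which scores land in which bin, you can compute the same binning with NumPy and print it. This sketch assumes the student data from above; np.histogram with an integer bins argument uses equal-width bins between the minimum and maximum, which is also what plt.hist does by default.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Math': [85, 92, 78, 95],
    'Science': [88, 85, 90, 92],
})
all_scores = pd.concat([df['Math'], df['Science']])

# counts has one entry per bin; edges has one more entry than counts
counts, edges = np.histogram(all_scores, bins=5)

for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.1f} to {right:.1f}: {n} score(s)")
```

With eight values spread over five bins, some bins hold only one score or none, which is a reminder that histograms become informative only with more data.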

Try It Yourself: An Exercise

Use this weather data:

weather = pd.DataFrame({
    'Day': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri'],
    'Temp': [22, 25, 19, 28, 24],
    'Rain': [0.1, 0.0, 0.3, 0.2, 0.0]
})

Tasks:

  1. Create a bar chart of temperatures.
  2. Add a second bar chart for rainfall using dual axes (hint: plt.twinx()).
  3. Label everything clearly.

Try it, then check the solution!

Solution
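One possible solution is sketched below. It is not the only way to do it: it uses the axes method ax.twinx() rather than the plt.twinx() shortcut from the hint, and it offsets the two sets of bars so they do not overlap. The colors, bar width, and variable names are illustrative choices, and the rainfall axis is labeled without units because the exercise data does not specify them.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

weather = pd.DataFrame({
    'Day': ['Mon', 'Tue', 'Wed', 'Thu', 'Fri'],
    'Temp': [22, 25, 19, 28, 24],
    'Rain': [0.1, 0.0, 0.3, 0.2, 0.0],
})

x = np.arange(len(weather))  # numeric positions for the days
width = 0.4                  # bar width (arbitrary, for illustration)

fig, ax_temp = plt.subplots(figsize=(8, 4))

# Task 1: temperature bars on the left y-axis
ax_temp.bar(x - width / 2, weather['Temp'], width,
            color='tomato', label='Temperature')
ax_temp.set_xlabel('Day')
ax_temp.set_ylabel('Temperature (°C)')
ax_temp.set_xticks(x)
ax_temp.set_xticklabels(weather['Day'])

# Task 2: a second y-axis on the right that shares the same x-axis
ax_rain = ax_temp.twinx()
ax_rain.bar(x + width / 2, weather['Rain'], width,
            color='steelblue', label='Rainfall')
ax_rain.set_ylabel('Rainfall')  # units are not given in the exercise data

# Task 3: title plus one combined legend for both axes
ax_temp.set_title('Daily Temperature and Rainfall')
h1, l1 = ax_temp.get_legend_handles_labels()
h2, l2 = ax_rain.get_legend_handles_labels()
ax_temp.legend(h1 + h2, l1 + l2, loc='upper left')

fig.tight_layout()
```

Add plt.show() to display it. The two y-axes have independent scales, so temperatures in the twenties and rainfall values below one both remain readable.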

This dual-axis plot compares temperature and rainfall elegantly.


Next Steps

You’ve now covered the Matplotlib basics! Next, try more advanced plotting with Seaborn, or start using your plots for machine learning with scikit-learn. Experiment with the exercise (change colors or add grids), then tell me what you’d like to see next!

Conclusion

In this part you explored data visualization for AI using the Matplotlib basics. You’ve learned to plot, compare, and analyze data, an essential step in carrying your Java expertise over to Python. Stay the course and you’ll be well on the path to AI success!
