Series and DataFrames
In this tutorial, we'll explore common operations on Pandas Series and DataFrames, highlighting their similarities and differences. We'll cover creating Series and DataFrames, performing basic operations like summation, and understanding the concept of axes.
1. Importing Pandas and Creating Data Structures
First, let's import the Pandas library and create a Series and a DataFrame.
import pandas as pd
# Create a Pandas Series
series = pd.Series([100, 203, 300])
# Create a Pandas DataFrame with nested values
data = {'A': [100, 200, 300], 'B': [200, 400, 600], 'C': [300, 600, 900]}
df = pd.DataFrame(data)
2. Performing Common Operations
Now, let's perform common operations like finding the sum of elements.
2.1 Summation in Series:
# Sum of elements in a Series
series_sum = series.sum()
print("Sum of elements in Series:", series_sum)
2.2 Summation in DataFrame:
# Sum of elements in each column of a DataFrame
column_sum = df.sum()
print("Sum of elements in each column of DataFrame:")
print(column_sum)
3. Understanding Axis
The axis
parameter specifies the axis along which the operation is performed. Default is axis=0
which means the operation is performed along the rows.
3.1 Summation along Columns (Axis 1):
# Sum of elements along columns (axis=1) in DataFrame
row_sum = df.sum(axis=1)
print("Sum of elements along columns (axis=1) in DataFrame:")
print(row_sum)
3.2 Handling Axis in Series:
When working with a Series, the concept of axis doesn't apply in the same way as it does in DataFrames.
# Uncommenting the below line would result in an exception:
# series_sum_axis1 = series.sum(axis=1)
# Series has only one dimension, so specifying axis=1 doesn't make sense.
# It would raise an exception.
4. Choosing Axis Representation
You can specify axis using either integers (0
for rows, 1
for columns) or strings ('index'
for rows, 'columns'
for columns).
# Specifying axis using string representation ('index' or 'columns')
row_sum_string_axis = df.sum(axis='index') # Equivalent to axis=0
column_sum_string_axis = df.sum(axis='columns') # Equivalent to axis=1
print("Sum of elements using string representation of axis:")
print("Row sum:", row_sum_string_axis)
print("Column sum:", column_sum_string_axis)
5. Conclusion
When choosing between integer or string representation of axis, consider readability and personal preference. Both representations are acceptable, but shorter representations (0
and 1
) can save space in your code.
# Choose based on personal preference and readability
# Longer representation for clarity:
df_max_columns_axis = df.max(axis='columns')
# Shorter representation for concise code:
df_max_columns_axis_short = df.max(axis=1)