Pivot Tables in SQL
A pivot table is a tool for summarizing and reorganizing large datasets. It is typically used to turn rows into columns, i.e. “pivot” the data.
Example - Pivot Table in Action?
Imagine you have a sales dataset that records transactions across different cities and dates. The table includes columns like City
, Date
, and Sales
. In its raw form, this data might be too cumbersome to extract meaningful insights, especially if you want to compare sales performance across cities or understand sales trends over time.
Raw Data
City | Date | Sales |
---|---|---|
New York | 2023-01-01 | 200 |
Los Angeles | 2023-01-01 | 150 |
New York | 2023-01-02 | 180 |
… | … | … |
Objective
Let’s say you want to understand the total sales per city.
Using a pivot table, you can restructure this data to have cities as columns, and the sum of sales per date as rows. This rearrangement makes it easier to compare the sales performance of different cities side by side.
Pivoted Data
Date | New York | Los Angeles |
---|---|---|
2023-01-01 | 200 | 150 |
2023-01-02 | 180 | … |
… | … | … |
This transformed view allows for an easier comparison of sales performance across cities and observation of sales trends over time.
Pivot in SQL
Pivot queries in SQL serve a similar purpose. They transform categorical data from rows into columns, enabling data summarization and reformatting.
Steps for a Pivot Query in SQL
Identify the Pivot Column: For our example, it’s
City
.Select the Aggregation Column: Here, it’s
Sales
.Choose the Aggregation Function: In this case, we use
SUM
to calculate total sales.Choose the Rows for Grouping: We’ll group the data by
Date
.Write the SQL Query:
SELECT
Date,
SUM(CASE WHEN City = 'New York' THEN Sales ELSE 0 END) AS New_York_Sales,
SUM(CASE WHEN City = 'Los Angeles' THEN Sales ELSE 0 END) AS Los_Angeles_Sales
FROM SalesData
GROUP BY Date;
This query produces a table with dates as rows, and separate columns for the summed sales in New York and Los Angeles for each date.