GA4 Analytics Pipeline

Here I’ll be analyzing the Google Analytics 4 data from the visualization below for the Google Merch store between November 2020 to January 2021.

Data acquisition

We start by acquiring the data; for this analysis, I’ll be using the Google Merchandise store dataset – which is publicly accessible as outlined in this Google Analytics resource page.

Data Processing

The next step is to create the relational schema and data tables. For this use case, I’ll create 5 data tables (using BigQuery):

1 Users: user-scoped dimension data
2. Visit Attributes: session-scoped dimension data
3. Items: lists each item listed on the site, including item descriptions
4. Transactions: event-scoped dimensions and metrics by transaction (transaction id)
5. Key Metrics: event-scoped metrics by visitor (visit id)

Then the next step is to setup the queries for creating each table. I’ll share an example of the query I used to create the “Users” table:

CREATE OR REPLACE TABLE `xxxxxxxxxxxxx.ga4_gms_sample_data.01_users` AS


SELECT

CONCAT(user_pseudo_id, (SELECT value.int_value FROM UNNEST(event_params) WHERE KEY = ‘ga_session_id’)) AS visit_id

MIN(user_pseudo_id) AS user_id,

SUM(CASE WHEN event_name = ‘session_start’ THEN 1 ELSE 0 END) session_starts,

MIN(PARSE_DATE(‘%Y%m%d’, event_date)) AS first_visit_date,

MAX(PARSE_DATE(‘%Y%m%d’, event_date)) AS last_visit_date,

MIN(DATETIME(TIMESTAMP_MICROS(event_timestamp))) AS first_visit_timestamp,

MAX(DATETIME(TIMESTAMP_MICROS(event_timestamp))) AS last_visit_timestamp,

FROM

`bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`

GROUP BY

1;

Data analysis

With the data now processed and store in our database, we can do some exploratory analysis of the topline numbers to get a sense of what the traffic volume is, how much transactions have occurred on the website for the period (volume and value), the average order value, and conversion rate.

Visualization

To visualize the data in Power BI, we can pull in the data tables from BigQuery into the Power BI data model with the following steps:

1. Open Power Bi
2. Select the “Get data” dropdown and click “More…”
3. Under “Database”, select “Google BigQuery”
4. After granting the necessary permissions, select the 5 data tables shown in the data processing step above
5. Load the data

Email: s.o@sesandatalab.com

© 2024 Sesan Data Lab