How to sessionize your GA4 event data in BigQuery (part 1: default 30-minute session timeout definition)
This tutorial will introduce you to the concept of sessionization. Step by step you will learn how to create a session table on top of your GA4 event data in BigQuery, so that each row will correspond to a session instead of an event.
Having access to the raw events is undeniably one of the main benefits of the GA4 data export in BigQuery. But for some analysis or reporting purposes (e.g. when creating a dashboard in Looker Studio), a lightweight table with user activity data aggregated by session is all we need.
That's why we will build a session table on top of the GA4 event data in this tutorial. Per session we will provide:
- user information (e.g. user id)
- session timestamps (first and last event)
- total engagement time
- traffic source information (e.g. session source)
- landing & exit page
- device information (e.g. device category)
- geographical information (e.g. country)
- event information (e.g. unique page views)
- ecommerce information (e.g. purchase revenue)
What is sessionization?
Sessionization is the process of organizing raw event data into meaningful groups based on user activity. This can help businesses better understand how users are interacting with their website or app, and can provide valuable insights for improving the user experience and optimizing marketing strategies. In terms of table structure, it means moving from one kind of table to another:
- event table: each row corresponds to one event
- session table: each row corresponds to one session
Sessionization typically involves identifying the start and end of each session, as well as calculating aggregated metrics, such as the session duration, the purchase revenue and the total or unique number of page views per session.
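To make the idea concrete before we move to BigQuery, here is a minimal sketch in plain Python (not the SQL we will write later). It assumes a heavily simplified, hypothetical event layout: the `page` and `revenue` keys are flattened for illustration and do not exist in this form in the GA4 export, and the function name `build_session_table` is made up. Timestamps are in microseconds, as in the GA4 `event_timestamp` field.

```python
from collections import defaultdict

# Hypothetical, simplified event rows. The real GA4 export is nested
# and has many more fields; this only illustrates the idea.
events = [
    {"user_pseudo_id": "u1", "ga_session_id": 1, "event_timestamp": 0,
     "event_name": "page_view", "page": "/home", "revenue": 0.0},
    {"user_pseudo_id": "u1", "ga_session_id": 1, "event_timestamp": 60_000_000,
     "event_name": "page_view", "page": "/product", "revenue": 0.0},
    {"user_pseudo_id": "u1", "ga_session_id": 1, "event_timestamp": 120_000_000,
     "event_name": "page_view", "page": "/home", "revenue": 0.0},
    {"user_pseudo_id": "u1", "ga_session_id": 1, "event_timestamp": 180_000_000,
     "event_name": "purchase", "page": "/checkout", "revenue": 49.5},
    {"user_pseudo_id": "u2", "ga_session_id": 7, "event_timestamp": 500_000_000,
     "event_name": "page_view", "page": "/home", "revenue": 0.0},
]


def build_session_table(events):
    """Turn an event table (one row per event) into a session table
    (one row per session), keyed by user and session id."""
    grouped = defaultdict(list)
    for event in events:
        key = (event["user_pseudo_id"], event["ga_session_id"])
        grouped[key].append(event)

    sessions = []
    for (user, session_id), rows in sorted(grouped.items()):
        rows.sort(key=lambda r: r["event_timestamp"])
        start = rows[0]["event_timestamp"]
        end = rows[-1]["event_timestamp"]
        pages = [r["page"] for r in rows if r["event_name"] == "page_view"]
        sessions.append({
            "user_pseudo_id": user,
            "ga_session_id": session_id,
            "session_start": start,
            "session_end": end,
            # microseconds to seconds
            "duration_seconds": (end - start) / 1_000_000,
            "page_views": len(pages),
            "unique_page_views": len(set(pages)),
            "purchase_revenue": sum(
                r["revenue"] for r in rows if r["event_name"] == "purchase"
            ),
        })
    return sessions


session_table = build_session_table(events)
for row in session_table:
    print(row)
```

Five event rows collapse into two session rows. Note that the grouping key combines the user and the session id: in this sketch we assume a session id on its own is not unique across users.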
What is a session in the context of GA4?
In GA4, a session is a group of user interactions with a website or mobile app that take place within a given time frame. Sessions are typically initiated when a user first visits a website or opens a mobile app, and end when the user is inactive for a certain amount of time (known as the session timeout).
By default, the session timeout in GA4 is 30 minutes. However, this is configurable in the user interface (for future sessions). Based on this setting, all events in a session are assigned a ga_session_id. Those sessions can be further divided into individual page views or screen views (in the case of a mobile app), as well as other events such as button clicks, item views, purchases or form submissions.
In this tutorial we will sessionize GA4 event data using the default session definition: the 30-minute session timeout. If another event for the same user is collected after a session timeout, a new ga_session_id will be assigned.
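The timeout rule itself can be sketched in a few lines of plain Python. This is only an illustration of the logic: GA4 applies the rule at collection time, so the exported events already carry their ga_session_id and we never have to compute it ourselves. The function name `assign_session_numbers` is hypothetical, the session numbers are simple counters rather than real ga_session_id values, and treating a gap of exactly 30 minutes as still belonging to the same session is an assumption of this sketch.

```python
SESSION_TIMEOUT_SECONDS = 30 * 60  # default GA4 session timeout


def assign_session_numbers(timestamps, timeout=SESSION_TIMEOUT_SECONDS):
    """Given the event times (in seconds) of ONE user, return the sorted
    times paired with a session counter. A new session starts whenever
    the gap to the previous event exceeds the timeout."""
    result = []
    session_number = 0
    previous = None
    for ts in sorted(timestamps):
        if previous is None or ts - previous > timeout:
            session_number += 1  # inactivity gap too long: new session
        result.append((ts, session_number))
        previous = ts
    return result


# Gaps: 600 s, 900 s, 2500 s (longer than 1800 s), 100 s
example = assign_session_numbers([0, 600, 1500, 4000, 4100])
print(example)
```

The first three events fall into session 1. The fourth event arrives 2,500 seconds after the third, which exceeds the 1,800-second timeout, so it opens session 2.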
In part 2 of this tutorial you will learn to sessionize your GA4 event data using custom session definitions. Instead of looking at a specific amount of time that has passed, we will focus on the actions of the user.