How to deal with repetitive chunks of SQL code: let a UDF create your GA4 channel group dimensions in BigQuery
This tutorial explains how to use user-defined functions (UDFs) in BigQuery to handle repetitive tasks in your queries, such as creating the default channel group logic. UDFs are a way to create your own custom functions using SQL. This can save time and make your queries more readable.
If you are a frequent user of default channel groups in GA4, you probably are already aware that the GA4 export in BigQuery doesnβt contain those fields. Youβll need to create them yourself in the queries if you want to make use of them.
Handling repetitive tasks
Both the first user default channel group as the session default channel group can be created using a monstrous case when-statement, based on Googleβs rule based definitions. If you regularly reuse channel groups as a dimension in your BigQuery results, it can be a pain in the proverbial ass to copy paste that big chunk of code into every query.
To improve your productivity in BigQuery, you might want to deep dive into the world of user-defined functions (UDFs). In this tutorial you will see just how powerful they can be when it comes to handling repetitive tasks in your queries.
Use case: what are channel groups?
Channel groups in GA4 make it easy to organise data in your user and traffic acquisition reports. Instead of seeing tons of sources and/or mediums in one report, you can use channel groupings to keep things tidy and easy to understand.
These channels include values like Organic Search, Referrals, Direct, Email and Paid Social. You can also customise the channel groups in Google Analytics to create unique channels that makes sense for your business.
Regarding the SQL logic of the channel groups: of course we already did the heavy lifting for you and some proper example queries can be found in the dimensions & smetrics section on this platform.
The solution: what are UDFs?
But before we get started, let's quickly go over what exactly UDFs are and why they're so useful. Essentially, UDFs in BigQuery are a way to create your own custom functions using either SQL or JavaScript. These functions can take an input argument, perform a specific task, and then produce an output. In other words, it will save you time and your queries will be more readable.
This is incredibly helpful when you have notorious repetitive tasks to run in your query, such as the default channel group logic. There are countless other use cases you can think of, from UDFs that clean up values (e.g. strip parameters from your urls) to a function that automagically grabs the necessary ingredients and creates a unique session id for each session.
But as said, in this tutorial we will limit our imagination (for now) and focus on the default channel group use case.
Two kinds of UDFs
A persistent UDF is a user-defined function that you create and save in a specific dataset within your BigQuery project. Once created, it can be called and reused in any query that is run by a user who has access to that dataset. This allows you to create a function once and then use it multiple times, which can be particularly helpful when you have repetitive tasks that need to be performed.
Temporary (or non-persistent) UDFs are functions that are only available for the duration of the current query. They can only be used within the specific query in which they are defined, and they are not stored or reusable.
By now it is hopefully clear why weβre going for a persistent UDF for our use case: we want to be able to reuse it infinite times.
Let's dive into how to create such a persistent UDF, using SQL.
In this tutorial you will learn how to:
- create your own user-defined functions (UDFs) in BigQuery to handle repetitive tasks, such as creating default channel groups
- understand the difference between persistent and temporary UDFs, and why persistent UDFs are more useful for repetitive tasks