Skip to main content

GA4 | dimensions & metrics

Traffic source and user acquisition dimensions & metrics (GA4)

How to query the most important traffic and user acquisition dimensions and metrics from the GA4 BigQuery export, like source, medium, campaign name, default channel grouping and page referrer.

A lot of Google Analytics 4 dimensions and metrics can be derived straight from the tables without performing calculations on the data. With the help of unnest and parse for dates or timestamps we can do the trick.

Other dimensions and metrics that you might need are a bit harder to access. You will have to (re)calculate them, which can require some serious SQL-skills.

💡
The GA4 export contains a major GA4 export misattribution bug where paid search events and sessions are mistaken for organic or direct traffic. Here is a tutorial on how to deal with this bug and make sure that at least source / medium and default channel group dimensions show up in your reports.

While designing the course Query GA4 Data In Google BigQuery I've learned a lot about best practices to calculate dimensions and metrics. To share this knowledge with you I will provide a combined query for default dimensions and metrics, and a single example query for every non-default dimension or metric.

As always, drop a line in the comments if you have any questions, feedback or suggestions related to this article.

💡
Reclaim a significant amount of your time by subscribing to our premium membership. You'll gain immediate access to a ready-to-go GA4 user acquisition or session acquisition report query.

Default traffic source and user acquisition dimensions

💡
If you only need one default dimension or metric, look at the -- comments in the example query below for names and definitions and copy the part you need from the select clause. Make sure that you also add any additional conditions (i.e. with, from, where, group by, having and order by) that are necessary to calculate the results correctly. E.g, the query below is ungrouped, so every row corresponds to an event and may contain duplicate rows.

User scope

Based on first session per user:

  • traffic_source.name (campaign)
  • traffic_source.medium
  • traffic_source.source
select
    -- traffic_source.name (dimension | name of the marketing campaign that first acquired the user)
    traffic_source.name,
    -- traffic_source.medium (dimension | name of the medium (paid search, organic search, email, etc.) that first acquired the user)
    traffic_source.medium,
    -- traffic_source.source (dimension | name of the network that first acquired the user)
    traffic_source.source
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20240701'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))

Event scope

Available from 2023-05-03:

  • collected_traffic_source.manual_campaign_id
  • collected_traffic_source.manual_campaign_name
  • collected_traffic_source.manual_source
  • collected_traffic_source.manual_medium
  • collected_traffic_source.manual_term
  • collected_traffic_source.manual_content
  • collected_traffic_source.gclid
  • collected_traffic_source.dclid
  • collected_traffic_source.srsltid

Available from 2024-07-11:

  • collected_traffic_source.manual_source_platform
  • collected_traffic_source.manual_creative_format
  • collected_traffic_source.manual_marketing_tactic
select
    -- collected_traffic_source.manual_campaign_id (dimension | the manual campaign id (utm_id) that was collected with the event)
    collected_traffic_source.manual_campaign_id,
    -- collected_traffic_source.manual_campaign_name (dimension | the manual campaign name (utm_campaign) that was collected with the event)
    collected_traffic_source.manual_campaign_name,
    -- collected_traffic_source.manual_source (dimension | the manual campaign source (utm_source) that was collected with the event)
    collected_traffic_source.manual_source,
    -- collected_traffic_source.manual_medium (dimension | the manual campaign medium (utm_medium) that was collected with the event)
    collected_traffic_source.manual_medium,
    -- collected_traffic_source.manual_term (dimension | the manual campaign keyword/term (utm_term) that was collected with the event)
    collected_traffic_source.manual_term,
    -- collected_traffic_source.manual_content (dimension | the additional manual campaign metadata (utm_content) that was collected with the event)
    collected_traffic_source.manual_content,
    -- collected_traffic_source.gclid (dimension | the Google click identifier that was collected with the event)
    collected_traffic_source.gclid,
    -- collected_traffic_source.dclid (dimension | the DoubleClick Click Identifier for Display and Video 360 and Campaign Manager 360 that was collected with the event)
    collected_traffic_source.dclid,
    -- collected_traffic_source.srsltid (dimension | the Google Merchant Center identifier that was collected with the event)
    collected_traffic_source.srsltid,
    -- collected_traffic_source.manual_source_platform (dimension | the manual campaign source platform (utm_source_platform) that was collected with the event)
    collected_traffic_source.manual_source_platform,
    -- collected_traffic_source.manual_creative_format (dimension | the manual campaign creative format (utm_creative_format) that was collected with the event)
    collected_traffic_source.manual_creative_format,
    -- collected_traffic_source.manual_marketing_tactic (dimension | the manual campaign marketing tactic (utm_marketing_tactic) that was collected with the event)
    collected_traffic_source.manual_marketing_tactic
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20240701'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))

Session scope

Available from 2024-07-17:

  • session_traffic_source_last_click.manual_campaign.campaign_id
  • session_traffic_source_last_click.manual_campaign.campaign_name
  • session_traffic_source_last_click.manual_campaign.source
  • session_traffic_source_last_click.manual_campaign.medium
  • session_traffic_source_last_click.manual_campaign.term
  • session_traffic_source_last_click.manual_campaign.content
  • session_traffic_source_last_click.manual_campaign.source_platform
  • session_traffic_source_last_click.manual_campaign.creative_format
  • session_traffic_source_last_click.manual_campaign.marketing_tactic
  • session_traffic_source_last_click.google_ads_campaign
  • session_traffic_source_last_click.google_ads_campaign.customer_id
  • session_traffic_source_last_click.google_ads_campaign.account_name
  • session_traffic_source_last_click.google_ads_campaign.campaign_id
  • session_traffic_source_last_click.google_ads_campaign.campaign_name
  • session_traffic_source_last_click.google_ads_campaign.ad_group_id
  • session_traffic_source_last_click.google_ads_campaign.ad_group_name
select
    -- session_traffic_source_last_click.manual_campaign.campaign_id (dimension | the ID of the last clicked manual campaign)
    session_traffic_source_last_click.manual_campaign.campaign_id,
    --session_traffic_source_last_click.manual_campaign.campaign_name (dimension | the name of the last clicked manual campaign)
    session_traffic_source_last_click.manual_campaign.campaign_name,
    -- session_traffic_source_last_click.manual_campaign.source (dimension | the specific source within the platform of the last clicked manual campaign)
    session_traffic_source_last_click.manual_campaign.source,
    -- session_traffic_source_last_click.manual_campaign.medium (dimension | the medium of the last clicked manual campaign)
    session_traffic_source_last_click.manual_campaign.medium,
    -- session_traffic_source_last_click.manual_campaign.term (dimension | the keyword/search term of the last clicked manual campaign)
    session_traffic_source_last_click.manual_campaign.term,
    -- session_traffic_source_last_click.manual_campaign.content (dimension | additional metadata of the last clicked manual campaign)
    session_traffic_source_last_click.manual_campaign.content,
    -- session_traffic_source_last_click.manual_campaign.source_platform (dimension | the platform of the last clicked manual campaign)
    session_traffic_source_last_click.manual_campaign.source_platform,
    -- session_traffic_source_last_click.manual_campaign.creative_format (dimension | the format of the creative of the last clicked manual campaign)
    session_traffic_source_last_click.manual_campaign.creative_format,
    -- session_traffic_source_last_click.manual_campaign.marketing_tactic (dimension | the marketing tactic of the last clicked manual campaign)
    session_traffic_source_last_click.manual_campaign.marketing_tactic,
    -- session_traffic_source_last_click.google_ads_campaign.customer_id (dimension | the customer ID associated with the Google Ads account)
    session_traffic_source_last_click.google_ads_campaign.customer_id,
    -- session_traffic_source_last_click.google_ads_campaign.account_name (dimension | the name of the Google Ads account)
    session_traffic_source_last_click.google_ads_campaign.account_name,
    -- session_traffic_source_last_click.google_ads_campaign.campaign_id (dimension | the ID of the Google Ads campaign)
    session_traffic_source_last_click.google_ads_campaign.campaign_id,
    -- session_traffic_source_last_click.google_ads_campaign.campaign_name (dimension | the name of the Google Ads campaign)
    session_traffic_source_last_click.google_ads_campaign.campaign_name,
    -- session_traffic_source_last_click.google_ads_campaign.ad_group_id (dimension | the ID of the ad group within the Google Ads campaign)
    session_traffic_source_last_click.google_ads_campaign.ad_group_id,
    -- session_traffic_source_last_click.google_ads_campaign.ad_group_name (dimension | the name of the ad group within the Google Ads campaign)
    session_traffic_source_last_click.google_ads_campaign.ad_group_name
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20240701'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))

Calculated traffic source and user acquisition dimensions

  • campaign (based on sessions)
with prep as (
select
    user_pseudo_id,
    (select value.int_value from unnest(event_params) where key = 'ga_session_id') as session_id,
    array_agg((select value.string_value from unnest(event_params) where key = 'campaign') ignore nulls order by event_timestamp)[safe_offset(0)] as campaign,
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20240701'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
group by
    user_pseudo_id,
    session_id)

select
    -- session campaign (dimension | the value of a campaign associated with a session)
    coalesce(campaign,'(not set)') as campaign_session,
    count(distinct concat(user_pseudo_id,session_id)) as sessions
from
    prep
group by
    campaign_session
order by
    sessions desc
  • medium (based on session)
with prep as (
select
	user_pseudo_id,
    (select value.int_value from unnest(event_params) where key = 'ga_session_id') as session_id,
    array_agg((select value.string_value from unnest(event_params) where key = 'medium') ignore nulls order by event_timestamp)[safe_offset(0)] as medium
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20240701'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
group by
    user_pseudo_id,
    session_id)

select
	-- session medium (dimension | the value of a medium associated with a session)
    coalesce(medium,'(none)') as medium_session,
    count(distinct concat(user_pseudo_id,session_id)) as sessions
from
    prep
group by
    medium_session
order by
    sessions desc
  • source (based on session)
with prep as (
select
	user_pseudo_id,
    (select value.int_value from unnest(event_params) where key = 'ga_session_id') as session_id,
    array_agg((select value.string_value from unnest(event_params) where key = 'source') ignore nulls order by event_timestamp)[safe_offset(0)] as source
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20240701'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
group by
    user_pseudo_id,
    session_id)

select
    -- session source (dimension | the value of a source associated with a session)
    coalesce(source,'(direct)') as source_session,
    count(distinct concat(user_pseudo_id,session_id)) as sessions
from
    prep
group by
    source_session
order by
    sessions desc
  • source / medium (based on user)
select
    concat(coalesce(traffic_source.source,'(direct)'),' / ',coalesce(traffic_source.medium,'(none)')) as source_medium_user,
    count(distinct user_pseudo_id) as users
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20240701'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
group by
    source_medium_user
order by
    users desc
  • source / medium (based on session)
with prep as (
select
    user_pseudo_id,
    (select value.int_value from unnest(event_params) where key = 'ga_session_id') as session_id,
    array_agg((select value.string_value from unnest(event_params) where key = 'source') ignore nulls order by event_timestamp)[safe_offset(0)] as source,
    array_agg((select value.string_value from unnest(event_params) where key = 'medium') ignore nulls order by event_timestamp)[safe_offset(0)] as medium
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20240701'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
group by
    user_pseudo_id,
    session_id)

select
    -- session source / medium (dimension | the value of a source and medium associated with a session)
    concat(coalesce(source,'(direct)'),' / ',coalesce(medium,'(none)')) as source_medium_session,
    count(distinct concat(user_pseudo_id,session_id)) as sessions
from
    prep
group by
    source_medium_session
order by
    sessions desc
  • default channel grouping (based on user)
💡
This query is based on Google's documentation about how GA4 classifies traffic. Please mind that not all conditions can be reconstructed, because some variables are not available in the BigQuery export.
select
    -- user default channel grouping (dimension | the channel group associated with an user's first session)
    case
    when (traffic_source.source is null or traffic_source.source = '(direct)') and (traffic_source.medium is null or traffic_source.medium in ('(not set)', '(none)')) then 'Direct'
    when traffic_source.name like '%cross-network%' then 'Cross-network'
    when (regexp_contains(traffic_source.source,'alibaba|amazon|google shopping|shopify|etsy|ebay|stripe|walmart')
    or regexp_contains(traffic_source.name, '^(.*(([^a-df-z]|^)shop|shopping).*)$'))
  and regexp_contains(traffic_source.medium, '^(.*cp.*|ppc|retargeting|paid.*)$') then 'Paid Shopping'
    when regexp_contains(traffic_source.source,'baidu|bing|duckduckgo|ecosia|google|yahoo|yandex') and regexp_contains(traffic_source.medium,'^(.*cp.*|ppc|retargeting|paid.*)$') then 'Paid Search'
    when regexp_contains(traffic_source.source,'badoo|facebook|fb|instagram|linkedin|pinterest|tiktok|twitter|whatsapp')
  and regexp_contains(traffic_source.medium,'^(.*cp.*|ppc|retargeting|paid.*)$') then 'Paid Social'
    when regexp_contains(traffic_source.source,'dailymotion|disneyplus|netflix|youtube|vimeo|twitch|vimeo|youtube') and regexp_contains(traffic_source.medium,'^(.*cp.*|ppc|retargeting|paid.*)$') then 'Paid Video'
    when traffic_source.medium in ('display','banner','expandable','interstitial','cpm') then 'Display'
    when regexp_contains(traffic_source.medium,'^(.*cp.*|ppc|retargeting|paid.*)$') then 'Paid Other'
    when regexp_contains(traffic_source.source,'alibaba|amazon|google shopping|shopify|etsy|ebay|stripe|walmart') or regexp_contains(traffic_source.name, '^(.*(([^a-df-z]|^)shop|shopping).*)$') then 'Organic Shopping'
    when regexp_contains(traffic_source.source,'badoo|facebook|fb|instagram|linkedin|pinterest|tiktok|twitter|whatsapp') or traffic_source.medium in ('social','social-network','social-media','sm','social network','social media') then 'Organic Social'
    when regexp_contains(traffic_source.source,'dailymotion|disneyplus|netflix|youtube|vimeo|twitch|vimeo|youtube') or regexp_contains(traffic_source.medium,'^(.*video.*)$') then 'Organic Video'
    when regexp_contains(traffic_source.source,'baidu|bing|duckduckgo|ecosia|google|yahoo|yandex') or traffic_source.medium = 'organic' then 'Organic Search'
    when traffic_source.medium in ('referral','app','link') then 'Referral'
    when regexp_contains(traffic_source.source,'email|e-mail|e_mail|e mail') or regexp_contains(traffic_source.medium,'email|e-mail|e_mail|e mail') then 'Email'
    when traffic_source.medium = 'affiliate' then 'Affiliates'
    when traffic_source.medium = 'audio' then 'Audio'
    when traffic_source.source = 'sms' or traffic_source.medium = 'sms' then 'SMS'
    when traffic_source.medium like '%push' or regexp_contains(traffic_source.medium,'mobile|notification') or traffic_source.source = 'firebase' then 'mobile push notifications'
    else 'Unassigned'
end
  as channel_grouping_user,
    count(distinct user_pseudo_id) as users
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20240701'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
group by
    channel_grouping_user
order by
    users desc
  • default channel grouping (based on session)
💡
This query is based on Google's documentation about how GA4 classifies traffic. Please mind that not all conditions can be reconstructed, because some variables are not available in the BigQuery export.
with prep as (
select
    user_pseudo_id,
    (select value.int_value from unnest(event_params) where key = 'ga_session_id') as session_id,
    array_agg((select value.string_value from unnest(event_params) where key = 'source') ignore nulls order by event_timestamp)[safe_offset(0)] as source,
    array_agg((select value.string_value from unnest(event_params) where key = 'medium') ignore nulls order by event_timestamp)[safe_offset(0)] as medium,
    array_agg((select value.string_value from unnest(event_params) where key = 'campaign') ignore nulls order by event_timestamp)[safe_offset(0)] as campaign
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20240701'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
group by
    user_pseudo_id,
    session_id)

select
    -- session default channel grouping (dimension | the channel group associated with a session) 
  case
    when (source is null or source = '(direct)') and (medium is null or medium in ('(not set)', '(none)')) then 'Direct'
    when campaign like '%cross-network%' then 'Cross-network'
    when (regexp_contains(source,'alibaba|amazon|google shopping|shopify|etsy|ebay|stripe|walmart')
    or regexp_contains(campaign, '^(.*(([^a-df-z]|^)shop|shopping).*)$'))
  and regexp_contains(medium, '^(.*cp.*|ppc|retargeting|paid.*)$') then 'Paid Shopping'
    when regexp_contains(source,'baidu|bing|duckduckgo|ecosia|google|yahoo|yandex') and regexp_contains(medium,'^(.*cp.*|ppc|retargeting|paid.*)$') then 'Paid Search'
    when regexp_contains(source,'badoo|facebook|fb|instagram|linkedin|pinterest|tiktok|twitter|whatsapp')
  and regexp_contains(medium,'^(.*cp.*|ppc|retargeting|paid.*)$') then 'Paid Social'
    when regexp_contains(source,'dailymotion|disneyplus|netflix|youtube|vimeo|twitch|vimeo|youtube') and regexp_contains(medium,'^(.*cp.*|ppc|retargeting|paid.*)$') then 'Paid Video'
    when medium in ('display','banner','expandable','interstitial','cpm') then 'Display'
    when regexp_contains(medium,'^(.*cp.*|ppc|retargeting|paid.*)$') then 'Paid Other'
    when regexp_contains(source,'alibaba|amazon|google shopping|shopify|etsy|ebay|stripe|walmart') or regexp_contains(campaign, '^(.*(([^a-df-z]|^)shop|shopping).*)$') then 'Organic Shopping'
    when regexp_contains(source,'badoo|facebook|fb|instagram|linkedin|pinterest|tiktok|twitter|whatsapp') or medium in ('social','social-network','social-media','sm','social network','social media') then 'Organic Social'
    when regexp_contains(source,'dailymotion|disneyplus|netflix|youtube|vimeo|twitch|vimeo|youtube') or regexp_contains(medium,'^(.*video.*)$') then 'Organic Video'
    when regexp_contains(source,'baidu|bing|duckduckgo|ecosia|google|yahoo|yandex') or medium = 'organic' then 'Organic Search'
    when medium in ('referral','app','link') then 'Referral'
    when regexp_contains(source,'email|e-mail|e_mail|e mail') or regexp_contains(medium,'email|e-mail|e_mail|e mail') then 'Email'
    when medium = 'affiliate' then 'Affiliates'
    when medium = 'audio' then 'Audio'
    when source = 'sms' or medium = 'sms' then 'SMS'
    when medium like '%push' or regexp_contains(medium,'mobile|notification') or source = 'firebase' then 'mobile push notifications'
    else 'Unassigned'
end
  as channel_grouping_session,
    count(distinct concat(user_pseudo_id,session_id)) as sessions
from
    prep
group by
    channel_grouping_session
order by
    sessions desc
  • page referrer (based on user)
with prep as (
select
    user_pseudo_id,
    (select value.int_value from unnest(event_params) where key = 'ga_session_id') as session_id,
    max((select value.string_value from unnest(event_params) where key = 'page_referrer')) as page_referrer
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20240701'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
group by
    user_pseudo_id,
    session_id),

rank as (
select
    user_pseudo_id,
    session_id,
    page_referrer,
    rank() over (partition by user_pseudo_id order by session_id) as rank
from
    prep
qualify
    rank = 1)

select
    -- user page referrer (dimension | the full referring url of a user's first session (within selected date range), including the hostname and path)
    coalesce(page_referrer,'(not set)') as page_referrer_user,
    count(distinct user_pseudo_id) as users
from
    rank
group by
    page_referrer_user
order by
    users desc
  • page referrer (based on session)
with prep as (
select
    user_pseudo_id,
    (select value.int_value from unnest(event_params) where key = 'ga_session_id') as session_id,
    max((select value.string_value from unnest(event_params) where key = 'page_referrer')) as page_referrer
from
    -- change this to your google analytics 4 export location in bigquery
    `ga4bigquery.analytics_250794857.events_*`
where
    -- define static and/or dynamic start and end date
    _table_suffix between '20240701'
    and format_date('%Y%m%d',date_sub(current_date(), interval 1 day))
group by
    user_pseudo_id,
    session_id)

select
    -- session page referrer (dimension | the full referring url of a session, including the hostname and path)
    coalesce(page_referrer,'(not set)') as page_referrer_session,
    count(distinct concat(user_pseudo_id,session_id)) as sessions
from
    prep
group by
    page_referrer_session
order by
    sessions desc