How (and why) to activate the GA4 user activity data export to BigQuery
As already announced in our newsletter, a new GA4 export has seen the light. Until recently all export data was event based, but from now on you can also obtain user-based export data in BigQuery. Here we will show you how to activate this user export. We assume you have already set up a connection to export your GA4 event data.
What is in there?
When you activate the user data export, you will get two new tables in your `analytics_***` data set.
- a table `pseudonymous_users_*` (where `*` is the date in `YYYYMMDD` format) with all data regarding the default `user_pseudo_id` identifier
  - every row represents one `user_pseudo_id`
  - the row is updated when there is a change to one of the fields
  - data for unconsented users is not exported to this table
  - the `user_id` field is not available in this table
  - every row contains a timestamp that indicates the last moment a pseudonymous user was active
- a table `users_*` (where `*` is the date in `YYYYMMDD` format) with all data regarding the optional `user_id` identifier
  - every row represents one `user_id`
  - the row is updated when there is a change to one of the fields
  - data for unconsented users can be exported to this table if it includes a `user_id`
  - the `user_pseudo_id` field is not available in this table
  - every row contains a timestamp that indicates the last moment a user was active
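To make the naming convention concrete, here is a minimal Python sketch that builds the two daily table IDs for a given date. The function name and the dataset name `analytics_123456` are placeholders made up for illustration; substitute your own data set.

```python
from datetime import date


def user_export_table_ids(dataset: str, day: date) -> dict:
    """Build the IDs of the two daily user export tables for one day."""
    suffix = day.strftime("%Y%m%d")  # the YYYYMMDD date suffix
    return {
        "pseudonymous_users": f"{dataset}.pseudonymous_users_{suffix}",
        "users": f"{dataset}.users_{suffix}",
    }


# "analytics_123456" is a placeholder dataset name, not a real property.
tables = user_export_table_ids("analytics_123456", date(2023, 6, 1))
print(tables["pseudonymous_users"])  # analytics_123456.pseudonymous_users_20230601
print(tables["users"])               # analytics_123456.users_20230601
```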
Why does it matter?
Most information that is in the user data export can be retrieved using the event data export that is already available in your BigQuery project. However, the user export makes the learning curve to access user (attribute) data a bit less steep.
Good examples of data that was already available are the user properties, user geographical information and user device data (directly), and user lifetime (LTV) statistics (indirectly).
But the most exciting part of this feature is the user data that you could not retrieve using the event data export: audience and prediction data.
Let's take a look at the new fields for those attributes.
User scoped audience data
As you can see in the schema below, you can get information about which users are (or were) part of any audience you have set up in GA4. This will enable you to use and activate audience data outside of the Google ecosystem (e.g. to target users on other advertising platforms).
Field name | Data type | Description |
---|---|---|
audiences | RECORD | Audience information |
audiences.id | INTEGER | ID of the audience |
audiences.name | STRING | Name of the audience |
audiences.membership_start_timestamp_micros | INTEGER | When the user was first included in the audience (timestamp in microseconds) |
audiences.membership_expiry_timestamp_micros | INTEGER | When the user's audience membership will expire (timestamp in microseconds). Membership duration is reset when new activity requalifies the user for the audience |
audiences.npa | BOOLEAN | true or false based on your NPA settings for events and user-scoped custom dimensions included in your audience definition |
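As a rough illustration of how the membership timestamps could be used, the Python sketch below lists the audiences a user currently belongs to. It assumes a row has already been fetched from BigQuery as a dictionary and that `audiences` arrives as a list of records (an assumption; check your own table schema). The helper names and the sample row are made up.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_datetime(micros: int) -> datetime:
    """Convert a Unix timestamp in microseconds to an aware UTC datetime."""
    return EPOCH + timedelta(microseconds=micros)


def datetime_to_micros(moment: datetime) -> int:
    """Inverse helper, only used here to build the sample data."""
    return (moment - EPOCH) // timedelta(microseconds=1)


def active_audience_names(row: dict, now: datetime) -> list:
    """Return names of audiences whose membership window contains `now`."""
    names = []
    for audience in row.get("audiences") or []:
        start = micros_to_datetime(audience["membership_start_timestamp_micros"])
        expiry = micros_to_datetime(audience["membership_expiry_timestamp_micros"])
        if start <= now < expiry:
            names.append(audience["name"])
    return names


def utc(year: int, month: int, day: int) -> datetime:
    return datetime(year, month, day, tzinfo=timezone.utc)


# Made-up sample row using the field names from the schema above.
sample_row = {
    "audiences": [
        {"id": 1, "name": "Purchasers", "npa": False,
         "membership_start_timestamp_micros": datetime_to_micros(utc(2023, 6, 1)),
         "membership_expiry_timestamp_micros": datetime_to_micros(utc(2023, 7, 1))},
        {"id": 2, "name": "Cart abandoners", "npa": False,
         "membership_start_timestamp_micros": datetime_to_micros(utc(2023, 4, 1)),
         "membership_expiry_timestamp_micros": datetime_to_micros(utc(2023, 5, 1))},
    ]
}

print(active_audience_names(sample_row, utc(2023, 6, 15)))  # ['Purchasers']
```

A list like this could then be handed to whichever platform you want to activate the audience on.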
User scoped prediction data
Google has also added prediction data for each user. By default it calculates `purchase_score_7d` (both for app and web), a churn prediction score and the expected revenue in USD.
Field name | Data type | Description |
---|---|---|
predictions | RECORD | Prediction information |
predictions.in_app_purchase_score_7d | DOUBLE | Probability that a user who was active in the last 28 days will log an in_app_purchase event within the next 7 days |
predictions.purchase_score_7d | DOUBLE | Probability that a user who was active in the last 28 days will log a purchase event within the next 7 days |
predictions.churn_score_7d | DOUBLE | Probability that a user who was active on your app or site within the last 7 days will not be active within the next 7 days |
predictions.revenue_28d_in_usd | FLOAT | Revenue expected (in USD) from all purchase events within the next 28 days from a user who was active in the last 28 days |
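One possible use of these scores is to shortlist users who are likely to churn. The Python sketch below assumes the rows are already available as dictionaries and that `predictions` can be empty for some users. The function name, the threshold of 0.7 and the sample rows are all made up for illustration.

```python
def users_at_risk(rows, threshold=0.7):
    """Return rows whose churn_score_7d is at least `threshold`, highest first."""
    at_risk = []
    for row in rows:
        predictions = row.get("predictions") or {}  # tolerate missing predictions
        score = predictions.get("churn_score_7d")
        if score is not None and score >= threshold:
            at_risk.append(row)
    return sorted(
        at_risk,
        key=lambda r: r["predictions"]["churn_score_7d"],
        reverse=True,
    )


# Made-up sample rows, shaped like rows from the users_* table.
sample_rows = [
    {"user_id": "u1", "predictions": {"churn_score_7d": 0.72}},
    {"user_id": "u2", "predictions": {"churn_score_7d": 0.10}},
    {"user_id": "u3", "predictions": {"churn_score_7d": 0.91}},
    {"user_id": "u4", "predictions": None},
]

print([row["user_id"] for row in users_at_risk(sample_rows)])  # ['u3', 'u1']
```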
Limitations
As always, there are some limitations to this feature.
- the export doesn't provide a historical backfill of the user data (e.g. for a reliable count of all users you still need the event data schema)
- as the user data export contains data that is already heavily processed by Google, it is to be expected that there will be differences if you compare user data from the user export with user data based on the event export
How to activate the export
Brace yourself for the shortest tutorial ever on this platform.
Navigate to the GA4 admin section in the user interface. Click BigQuery links and select your existing BigQuery link configuration.
If the feature is already rolled out to your property, you will see a new checkbox at the bottom of the page.
Select the Daily checkbox and click Save. You're all set, but it will take some time until the next batch of user data is exported to BigQuery, as this only happens once a day.
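To check on a later day whether the first export has arrived, you could look at the table names in your data set. The Python sketch below works on a plain list of table IDs (for example collected from the `table_id` attribute of the items returned by `Client.list_tables` in the `google-cloud-bigquery` library) and reports which dates are present for each user table. The function name and the sample IDs are made up.

```python
import re

# Matches pseudonymous_users_YYYYMMDD and users_YYYYMMDD, nothing else.
USER_TABLE_PATTERN = re.compile(r"^(pseudonymous_users|users)_(\d{8})$")


def exported_user_dates(table_ids):
    """Group the date suffixes of user export tables by table kind."""
    found = {"pseudonymous_users": [], "users": []}
    for table_id in table_ids:
        match = USER_TABLE_PATTERN.match(table_id)
        if match:
            found[match.group(1)].append(match.group(2))
    return {kind: sorted(dates) for kind, dates in found.items()}


# Made-up table IDs, as they might appear in an analytics_*** data set.
sample_table_ids = [
    "events_20230601",
    "events_intraday_20230602",
    "pseudonymous_users_20230601",
    "users_20230601",
    "pseudonymous_users_20230602",
]

print(exported_user_dates(sample_table_ids))
# {'pseudonymous_users': ['20230601', '20230602'], 'users': ['20230601']}
```

If both lists are still empty a day or more after saving, it is worth checking the link configuration again.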
Now it's your turn!
I hope you've enjoyed this tutorial and feel a bit more confident to utilise your own Google Analytics data in BigQuery. Drop a line in the comments if you have any questions, feedback or suggestions related to this article.