LogoLogo
  • Datawisp Documentation
  • Datawisp Overview
  • Adding your data
    • Connect to a database
      • Snowflake
      • BigQuery
      • Google Analytics via BigQuery
      • MotherDuck
    • Tips for Success
    • Troubleshooting
    • Reloading the database schema
    • Data Dictionaries
  • Analyzing Data
    • Datawisp Blocks
      • Data Source
      • Summarize & Group
      • Combine
      • Order and Select Rows
      • Table
      • Filter
      • Add Column
      • Cleanup
      • JSON
      • Chart
      • Reshape
    • Wispy
      • The SQL View
  • Deployment
  • Security
  • Privacy
  • Data processing
  • Contact Info
  • Additional Resources
    • Tutorials
Powered by GitBook
LogoLogo

Datawisp

  • Main Site
  • App

Social

  • X
  • Linkedin

Legal

  • Privacy Policy
  • Terms of Use

Copyright 2023 Datawisp, Inc.

On this page
  • Create a view in BigQuery
  • Grant Access Permissions
  1. Adding your data
  2. Connect to a database

Google Analytics via BigQuery

PreviousBigQueryNextMotherDuck

Last updated 8 months ago

When connecting Google Analytics (GA) to BigQuery, your data is stored in a format known as sharding. Table sharding is the practice of storing data in multiple tables, using a naming prefix such as [PREFIX]_YYYYMMDD. This presents a challenge when doing analytics on the data, as the data is not in a single table (the BigQuery documentation also ).

When importing this directly into Datawisp, Datawisp would treat each shard as a separate table -- this would lead to a very poor experience overall as you end up with hundreds of tables when you logically only really want one.

In order to solve this, we have created this guide to show you how to create a view in BigQuery that will present the data in a single table to Datawisp.

Create a view in BigQuery

First, we need to create a view in BigQuery that will present the data in a single table to Datawisp. Open the BigQuery console create a new SQL query.

There, paste the following SQL statements:

CREATE OR REPLACE VIEW `<dataset>.ga_events` AS 
SELECT
  * EXCEPT(event_params),
  parse_json((SELECT 
      CONCAT(
      '{',
        STRING_AGG(
          CONCAT(
            TO_JSON_STRING(ep.key),
            ': ',
            COALESCE(
              NULLIF(TO_JSON_STRING(ep.value.int_value), 'null'),
              NULLIF(TO_JSON_STRING(ep.value.string_value), 'null'),
              NULLIF(TO_JSON_STRING(ep.value.double_value), 'null'),
              NULLIF(TO_JSON_STRING(ep.value.float_value), 'null'),
              'null'
            )
          )
        ),
      '}'
    )
    FROM UNNEST(event_params) AS ep
  ), wide_number_mode=>'round') as event_params
FROM `<dataset>.events_*`;
CREATE OR REPLACE VIEW `<dataset>.ga_users` AS select * from `<dataset>.users_*`;
CREATE OR REPLACE VIEW `<dataset>.ga_pseudonymous_users` AS select * from `<dataset>.pseudonymous_users_*`;

Google Analytics supports exporting both Event Data (events_YYYYMMDD) and User Data (users_YYYYMMDD, pseudonymous_users_YYYYMMDD).

This guide shows you how to import all of those tables into Datawisp. If you don't have some of them (e.g. because you did not enable the export of User Data), simply omit the respective steps.

For example, if you don't have any events_YYYYMMDD tables, simply remove the first CREATE VIEW statement in the above SQL snippet.

Make sure to replace <dataset> with the name of your dataset.

Then press Run.

Grant Access Permissions

Then, on the dataset go to Sharing -> Authorize Views.

Authorize the views that you created in the previous step.

Then, on each of the views, grant the BigQuery Data Viewer role.

This setup effectively hides all the sharded tables from Datawisp and only grants access to the views.

Datawisp will now be able to import your GA data correctly.

At this point the GA-specific preparations are complete, so you can now get started with the generic .

Follow this guide until you reach the step . At this point, it is important that you do not grant the BigQuery Data Viewer role on the entire dataset to the service account. Grant the BigQuery Metadata Viewer role instead.

Then, proceed with the of the guide.

BigQuery import guide
recommends against using table sharding
Grant Access to the Dataset to the Service Account
rest