Google Data Studio, a free online data visualization tool, connects to a multitude of data sources including Google Analytics.
If you have a very high volume of Google Analytics data and use Data Studio for reporting and analysis, you may see discrepancies between what the Google Analytics interface displays and what you see in Data Studio. If you have this issue and are positive you are using the same parameters (think: filters, segments, scope, date range, etc.) in GDS as in GA, check for data sampling.
Although there are many reasons Google Analytics may not align 1:1 with another platform you use to measure your goals & KPIs, data sampling is one of the most critical to be aware of (second perhaps only to scope) if you want to make the best decisions with your data.
If you haven’t read our comprehensive guide to understanding and solving Google Analytics data sampling, I suggest reading that post first here.
Google Data Studio uses the same sampling behavior as Google Analytics. This means that if a chart in Data Studio creates an ad hoc request for data in Google Analytics, standard sampling rules will come into play. And keep in mind: you can’t change the Google Analytics sampling rate in Data Studio.
Although sampling is relatively rare and is usually triggered by an incredibly high volume of data (hits), you can still run into it in the Google Data Studio interface if you have standard GA.
GA 360 accounts, however, have a much higher sampling threshold, and that higher threshold also applies to Analytics 360 users who use the GA connector to add their GA 360 data to Data Studio. Still, depending on the complexity of the data you’re using in your reports, there have been cases where GA 360 users are sampled in Data Studio as well.
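To make the standard vs. 360 difference concrete, here is a minimal Python sketch of a pre-check you could run on your own session counts. The threshold values are assumptions based on Google's published Universal Analytics guidance (roughly 500k sessions for standard GA and 100M for Analytics 360), and the function name is hypothetical; treat it as a rule of thumb, not as how Data Studio actually decides.

```python
# Assumed thresholds, taken from Google's Universal Analytics guidance.
# Verify them against the current documentation before relying on them.
STANDARD_GA_THRESHOLD = 500_000      # sessions in the date range, standard GA
GA_360_THRESHOLD = 100_000_000       # sessions in the date range, Analytics 360


def may_be_sampled(sessions_in_date_range: int, is_360: bool = False) -> bool:
    """Return True if an ad hoc query over this many sessions could be sampled.

    Hypothetical helper: it only compares a session count to the assumed
    threshold and ignores everything else that influences sampling.
    """
    threshold = GA_360_THRESHOLD if is_360 else STANDARD_GA_THRESHOLD
    return sessions_in_date_range > threshold


if __name__ == "__main__":
    print(may_be_sampled(750_000))               # standard GA property
    print(may_be_sampled(750_000, is_360=True))  # same volume on GA 360
```

The same date range can sit above the standard threshold and far below the 360 one, which is why the upgrade is often suggested as a fix.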
The only way to truly know if your Data Studio report is affected by Google Analytics sampling is to check for the indicator.
If your dashboard contains sampled data, the “Show Sampling” link will appear in the report footer on any page that contains charts based on Google Analytics. This link is available to all viewers of the report and there is no way to hide it.
Clicking the “Show Sampling” link will place a highlighted box around any charts that contain sampled Google Analytics data. The sampling rate is displayed above each chart. To remove the highlight, click “Hide Sampling”.
Are you using custom filters or segments in your report? If so, what are they and how are they configured? It’s likely you will experience sampling in either GA or GA 360 if your data volume is high AND you are applying a custom filter or segment with multiple conditions.
For example, if you have a high volume of data for your given date range, and use a filter with conditions such as:
- Channel exactly matches ‘Organic Search’ or ‘Direct’
- Completed 1 or more conversions
- Landed on a page that contains ‘blog’
… you will likely be sampled in Data Studio if you are connecting to the standard (free version) of Google Analytics.
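To see why a multi-condition filter on sampled data can drift from the true number, here is a small Python sketch. Everything in it is hypothetical: the `Session` fields, the synthetic data, and the "take every Nth session" sampling, which is only a stand-in for whatever method Google Analytics really uses. It applies the three conditions above to the full data set and to a 10% sample that is then scaled back up.

```python
from dataclasses import dataclass


@dataclass
class Session:
    channel: str
    conversions: int
    landing_page: str


def matches_filter(s: Session) -> bool:
    """The three example conditions, combined with AND."""
    return (
        s.channel in ("Organic Search", "Direct")
        and s.conversions >= 1
        and "blog" in s.landing_page
    )


def exact_count(sessions: list[Session]) -> int:
    """Count matching sessions using every row (unsampled)."""
    return sum(matches_filter(s) for s in sessions)


def sampled_estimate(sessions: list[Session], every_nth: int) -> int:
    """Count matches in a 1-in-N sample, then scale up by N (assumed method)."""
    sample = sessions[::every_nth]
    return sum(matches_filter(s) for s in sample) * every_nth


# Synthetic, deterministic data purely for illustration.
CHANNELS = ["Organic Search", "Direct", "Referral"]
sessions = [
    Session(
        channel=CHANNELS[i % 3],
        conversions=1 if i % 7 == 0 else 0,
        landing_page="/blog/post" if i % 4 == 0 else "/products",
    )
    for i in range(1000)
]

if __name__ == "__main__":
    # The two numbers differ because few sessions satisfy all three conditions,
    # so the sample contains only a handful of matches to extrapolate from.
    print("exact:", exact_count(sessions))
    print("estimated from 10% sample:", sampled_estimate(sessions, every_nth=10))
```

The narrower the filter, the fewer matching sessions end up in the sample, and the more the scaled-up estimate can wander from the real total.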
If that’s the case, your best next step is to consider upgrading to GA 360 from standard GA. Get all the details here.
Sampling is typically the most evident when comparing scorecard data to data within chart visuals. That’s because Scorecards are inherently reliant on the Date Range of the Data Source in order to do the math between the current and previous date ranges.
So, if you are used to being sampled in GA when you compare your date ranges, this will be no different in Data Studio and your scorecard will display inaccurate data vs the total of each time period in GA.
Again, if you are being sampled when comparing date ranges in GA, nine times out of ten you will be sampled in Data Studio too, especially within scorecards.
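The comparison math itself is simple, which is why small sampling errors show up so clearly in it. The Python sketch below assumes the scorecard shows a percent change between the two periods; the function name and all the numbers are made up for illustration.

```python
def percent_change(current: float, previous: float) -> float:
    """Percent change from the previous period to the current one."""
    if previous == 0:
        raise ValueError("previous period total must be non-zero")
    return (current - previous) / previous * 100


if __name__ == "__main__":
    # Hypothetical unsampled totals for the two date ranges.
    exact = percent_change(current=12_000, previous=10_000)

    # Hypothetical sampled estimates: each is only a few percent off,
    # but the errors point in opposite directions.
    sampled = percent_change(current=12_600, previous=9_700)

    print(f"unsampled change: {exact:.1f}%")
    print(f"sampled change:   {sampled:.1f}%")
```

Because the comparison divides one estimate by another, an error of a few percent in each period can turn into a much larger gap in the reported change.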
It’s easy to assume sampling when you notice data discrepancies between your Data Studio report and the Google Analytics interface. However, depending on your familiarity and experience with Google Data Studio, it’s more likely that there are other factors interfering with the accuracy of your reports’ data.
For example, I’ve seen many Data Studio users who use a CASE statement to clean data by separating what’s clean from what isn’t. However, this is not best practice! CASE statements are not designed to clean or manipulate data. They are functions that group the values of a dimension or metric according to logical rules.
You can learn more about CASE statements here.
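If the idea of "grouping by logical rules" feels abstract, the Python sketch below mimics it. This is an analogy, not Data Studio formula syntax, and the medium values and group names are hypothetical. Each rule is checked in order, the first match wins, and anything unmatched falls through to a default, much like WHEN/THEN/ELSE.

```python
def channel_group(medium: str) -> str:
    """Map a raw medium value to a reporting group; first matching rule wins."""
    m = medium.strip().lower()
    if m == "organic":
        return "Organic Search"
    if m in ("cpc", "ppc"):
        return "Paid Search"
    if m == "(none)":
        return "Direct"
    return "Other"  # plays the role of ELSE


if __name__ == "__main__":
    for raw in ["organic", "CPC", "(none)", "email"]:
        print(raw, "->", channel_group(raw))
```

Note that the function only assigns labels. It does not repair or drop bad rows, which is the distinction between grouping and cleaning.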
Using CASE this way may prompt GDS to return incorrect data without giving you any warning. That’s why I always recommend testing your knowledge of GDS fundamentals (by taking the free course from Google on the Analytics Academy here) before diving into the more advanced functions like CASE.
If you still have problems with GDS not matching GA data, then you might want to hit up the Google Data Studio Help Center to further diagnose what the issue might be. Hopefully you’ve found this helpful, and can take further steps to avoid sampling in Data Studio by being more aware of how things work in the platform.
If you feel that the benefits of Google Analytics 360 are a good fit for your company, you can contact a Google Partner (like us at Seer) to upgrade to the enterprise version, Google Analytics 360 (a part of the Google Marketing Platform). Fill out the interest form below to talk data to us.
Leave your questions or comments below, and don’t forget to read our Analytics team’s guide to Google Analytics data sampling here.