How To Find & Fix These 3 Red Flags In Your Google Analytics Data

Through Google Analytics tracking you can gain valuable insights into who is coming to your site, how they are using it and whether or not your website is doing its job.

While implementing Google Analytics tracking is the first step toward making smart business decisions based on data, it’s really just the first step.

Without regular attention and maintenance, your business is either blindly relying on its Google Analytics data (scary), or is likely not using it at all (sad).

But fear not! Here are three common issues and how to fix them to restore data integrity.

1. Self Referrals

This is a biggie. Google’s definition of a self-referral is referral traffic that originates from pages within your own domain.

The Impact

A self referral means that a session for a single user has broken and restarted without actually leaving your site. Google Analytics reads your domain as the referrer, effectively overwriting actual acquisition information.

Why is this happening?

Self referrals are caused by errors in your tracking implementation such as missing tracking code, an absent or inaccurate cross domain setup, internal link tagging with utm parameters and more.

How can I check my data?

Within Google Analytics, navigate to the Referrals report (Acquisition > All Traffic > Referrals). Look for your own web properties within the report. Self referrals may include your main domain (e.g. yoursite.com) and other tracked web properties (e.g. blog.yoursite.com).
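
As a rough illustration of this check, the sketch below scans a list of referral sources (for example, copied from an export of the Referrals report) and flags any that belong to your own domains. The function name, the sample sources and the yoursite.com domain are assumptions made for the example; none of them come from Google Analytics itself.

```python
def find_self_referrals(sources, own_domains):
    """Return the referral sources that belong to one of your own domains."""
    flagged = []
    for source in sources:
        host = source.strip().lower()
        for domain in own_domains:
            domain = domain.lower()
            # Match the bare domain and any subdomain of it.
            if host == domain or host.endswith("." + domain):
                flagged.append(source)
                break
    return flagged


# Hypothetical referral sources taken from an exported report.
referral_sources = ["google.com", "yoursite.com", "blog.yoursite.com", "partner.example"]
self_referrals = find_self_referrals(referral_sources, ["yoursite.com"])
print(self_referrals)
```

Anything this flags is a candidate self referral worth raising with whoever manages your tracking.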

How can I fix it?

Connect with the person managing your tracking implementation. Self-referrals can be addressed by adding tracking to any untracked company web property (if people can click from your site to it, it should be tracked). You may also need to adjust your cross domain implementation (is a site missing from it?) or remove utm parameters from internal website links (clicking a link tagged with utm parameters starts a new session, losing the previous referral information).
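
To make the internal link check concrete, here is a minimal sketch that looks through a list of links and reports the internal ones that carry utm parameters. It assumes you already have the links collected (for example, from a site crawl); the function name, the sample links and the yoursite.com domain are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def internal_links_with_utm(links, own_domains):
    """Return links that point at your own domains and carry utm_ parameters."""
    flagged = []
    for link in links:
        parsed = urlparse(link)
        host = (parsed.hostname or "").lower()
        is_internal = any(host == d or host.endswith("." + d) for d in own_domains)
        # keep_blank_values=True so an empty "utm_source=" is still detected.
        params = parse_qs(parsed.query, keep_blank_values=True)
        has_utm = any(key.lower().startswith("utm_") for key in params)
        if is_internal and has_utm:
            flagged.append(link)
    return flagged


# Hypothetical links found on your site.
links = [
    "https://www.yoursite.com/pricing?utm_source=homepage&utm_medium=banner",
    "https://www.yoursite.com/about",
    "https://partner.example/?utm_source=yoursite",
]
tagged_internal_links = internal_links_with_utm(links, ["yoursite.com"])
print(tagged_internal_links)
```

Only the first link is reported: the second has no utm parameters, and the third points to an external site, where campaign tagging is appropriate.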

Do not be dazzled by the promise of Universal Analytics’ referral exclusion list. While adding an internal hostname to the list may prevent sessions from breaking for that hostname, it will also effectively mask broken tracking on your website that needs attention.

You can also check out Google’s resource on self referrals, which explains common causes in greater detail and suggests methods to help you detect self referrals in your data.

2. Personally Identifiable Information (PII)

PII is anything about a user that could be used to identify an individual. It may include (but is not limited to) names, addresses, email addresses, personal phone numbers, credit card information and social security numbers.

PII within your Google Analytics data is serious because Google’s policy prohibits it, threatening that accounts containing PII can be permanently closed (meaning the loss of all of your data).

Why is this happening?

PII typically reaches your Google Analytics data through url strings. Consider features on your website, such as form fields and search bars, where personal user information entries are passed through the url without encryption.

How can I check my data?

Navigate to the Pages report (Behavior > Site Content > All Pages) and begin an advanced search, pasting one of the following regular expressions into the search field.

Search for phone numbers:

\?.*([=:,!]|%2[1C])(\(|%28)?\d{3}([\s+.,)-]|%2[0B1C9])*\d{3}([\s+.,-]|%2[0B1C])*\d{4}([\s+]|%2[0B])*($|[&#:,!%])


Search for email addresses:

\?.*(@|%40)

Search for physical addresses:

\?.*\b(St(reet)?|Ave(nue)?|B(ou)?le?v(ar)?d|(High)?Way|Ln|Lane|Road|Rd)\b

Search for ZIP+4 codes (five digits, a hyphen, then four digits):

\?.*([=:,!]|%2[1C])\d{5}(\s|\+|%2[0B])*-(\s|\+|%2[0B])*\d{4}($|[&#:,!%])

Search for credit card information (Visa/MasterCard/Discover):

\?.*([=:,!]|%2[1C])(4[0-9]|5[1-5]|2[2-7]|6[05])(([\s+.,-]|%2[0B1C])*\d){12}($|[&#:,!%])

Search for Social Security Number:

\?.*([=:,!]|%2[1C])\d{3}-?\d{2}-?\d{4}($|[&#:,!%])
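
If you would rather check an exported list of page paths in bulk, the same kind of search can be run outside the Google Analytics interface. The sketch below is one way to do that, using three of the patterns above (with the SSN pattern written as \d{3} at the start). The function name and sample paths are made up for the example, and case-insensitive matching is an assumption chosen so that lowercase percent-encodings such as %2c are also caught.

```python
import re

# Patterns taken from the searches above; labels are just for reporting.
PII_PATTERNS = {
    "phone": r"\?.*([=:,!]|%2[1C])(\(|%28)?\d{3}([\s+.,)-]|%2[0B1C9])*\d{3}"
             r"([\s+.,-]|%2[0B1C])*\d{4}([\s+]|%2[0B])*($|[&#:,!%])",
    "email": r"\?.*(@|%40)",
    "ssn": r"\?.*([=:,!]|%2[1C])\d{3}-?\d{2}-?\d{4}($|[&#:,!%])",
}
COMPILED = {label: re.compile(p, re.IGNORECASE) for label, p in PII_PATTERNS.items()}


def scan_paths(paths):
    """Map each suspicious page path to the PII pattern labels it matched."""
    flagged = {}
    for path in paths:
        hits = [label for label, rx in COMPILED.items() if rx.search(path)]
        if hits:
            flagged[path] = hits
    return flagged


# Hypothetical page paths from an exported Pages report.
sample_paths = [
    "/contact?phone=555-123-4567",
    "/signup?email=jane%40example.com",
    "/apply?ssn=123-45-6789",
    "/products?id=42",
]
flagged_paths = scan_paths(sample_paths)
for path, labels in flagged_paths.items():
    print(path, labels)
```

These patterns are intentionally broad, so treat each hit as something to review, not as confirmed PII. A nine-digit order number, for instance, would also match the SSN pattern.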

How can I fix it?

If you identify PII, get in touch with your developer. Share the parameters you’ve found sending the information into Google Analytics and request that any values currently passing PII be encrypted.

If addressing at the source is not an option, you can implement a solution through Google Tag Manager to strip parameters from the url before it makes it to Google Analytics by following these steps:

  1. Create a custom javascript variable in Google Tag Manager called ‘Pageview URL – Custom’ using this javascript shared by Seer team member Stephen Harris.
  2. Create a test Google Analytics property. This will allow you to validate the solution before rolling it out to your active Google Analytics property.
  3. Create a test pageview tag pointing to a test UA property. Within this tag, set the ‘page’ field with your new variable.
  4. Test and publish the tag.
  5. Review urls in your test Google Analytics property to confirm that your solution is functioning as expected.
  6. If functioning as expected, complete the configuration for your live pageview tag and publish.
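
The script referenced in step 1 is not reproduced here, and a Google Tag Manager variable has to be written in JavaScript. Purely to illustrate the kind of logic such a variable would perform, the Python sketch below removes a set of query parameters from a url before it is reported. The blocked parameter names, the function name and the example urls are all assumptions for the illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameter names known to carry PII on your site.
BLOCKED_PARAMS = {"email", "phone", "name"}


def strip_pii_params(url, blocked=BLOCKED_PARAMS):
    """Return the url with any blocked query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in blocked
    ]
    # Rebuild the url from its original pieces plus the filtered query string.
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


clean_url = strip_pii_params(
    "https://www.yoursite.com/thanks?email=jane%40example.com&plan=pro"
)
print(clean_url)
```

Removing parameters by name is only one possible design. It relies on knowing in advance which parameters carry PII, which is why reviewing the output in a test property first (steps 2 through 5) matters.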

In addition to the URL, PII can also be inadvertently sent into Google Analytics through event and custom dimension values. If you are not familiar with your tracking implementation and data, these are important to check as well.

3. Referral Spam

Referral spam is non-genuine traffic that pollutes your Google Analytics data.

The Impact

Referral spam can send fake data into your Google Analytics account, with or without actually visiting your website. The result is non-genuine session and user data and inflated session counts, which distort any metric calculated from sessions, such as bounce rate, pages per session and avg. session duration.

Why is this happening?

Referral spam can be the result of ‘Ghost’ or fake data being sent to Google Analytics’ servers (likely targeting GA tracking IDs at random) through the Measurement Protocol. It can also be a result of ‘Crawlers’ or people using bots to crawl websites without making the effort to block their activity from Analytics.

How can I check my data?

To check your own data, navigate to the Hostname report (Audience > Technology > Network > Select Hostname). If you see a hostname you do not recognize or ‘(not set)’ listed, this is likely ‘Ghost’ Referral Spam.

‘Ghost’ spam will typically be sent with a made up hostname or none at all. Because this traffic never actually made it to your site, you’ll notice the traffic has a 100% bounce rate, an average of 1 page per session and 0.00 seconds for average session duration.
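
The signature described above can be turned into a simple check over exported Hostname report rows. This is only a sketch: the column names, the valid hostnames and the sample rows are hypothetical, and a real export would need to be mapped onto this shape first.

```python
# Hypothetical list of hostnames that legitimately host your tracking code.
VALID_HOSTNAMES = {"www.mysite.com", "blog.mysite.com"}


def looks_like_ghost_spam(row, valid_hostnames=VALID_HOSTNAMES):
    """Flag rows with an unknown hostname and no sign of a real visit."""
    hostname = row["hostname"].strip().lower()
    # "(not set)" is never in the valid list, so it counts as unknown too.
    unknown_host = hostname not in valid_hostnames
    zero_engagement = (
        row["bounce_rate"] == 100.0
        and row["pages_per_session"] == 1.0
        and row["avg_session_duration"] == 0.0
    )
    return unknown_host and zero_engagement


# Hypothetical rows from an exported Hostname report.
rows = [
    {"hostname": "www.mysite.com", "bounce_rate": 42.5,
     "pages_per_session": 2.3, "avg_session_duration": 95.0},
    {"hostname": "(not set)", "bounce_rate": 100.0,
     "pages_per_session": 1.0, "avg_session_duration": 0.0},
    {"hostname": "made-up-host.example", "bounce_rate": 100.0,
     "pages_per_session": 1.0, "avg_session_duration": 0.0},
]
suspects = [row["hostname"] for row in rows if looks_like_ghost_spam(row)]
print(suspects)
```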

Referral Spam can also be detected in the Referrals report. Navigate to Acquisition > All Traffic > Referrals and look for sources you do not recognize. Something I’ve learned about Referral Spam is that sources are rarely new. So if you do not recognize a particular referral traffic source, Google it: ‘example_source referral spam’. Between the results and your knowledge of expected traffic sources to your website, you should be able to discern non-genuine traffic sources for your website.

How can I fix it?

If you’ve identified Referral Spam sources through unrecognized hostnames or referral sources, you can configure Google Analytics to exclude it through Filters.

For ‘Ghost’ Referral Spam hostnames

  1. Navigate to Filters (Admin > All Filters at Account level).
  2. Create a new filter, select ‘Add Filter’
  3. Choose Custom
  4. Select ‘Include’
  5. Select ‘Hostname’ from the drop down
  6. Specify all Hostnames that should be sending data in to GA (those that host your tracker).

IMPORTANT: If you have multiple Hostnames, Regular Expression must be used to specify all Hostnames in a single include filter. Regular Expression is a powerful language you should read up on. For the purposes of this Filter, you’ll use the Escape (‘\’) and Or (‘|’) characters.

For example:

www\.mysite\.com|blog\.mysite\.com|press\.mysite\.com

  7. Apply it to relevant views of your data
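
If you have more than a handful of hostnames, writing the escapes by hand is error-prone. The sketch below builds an include pattern like the example above from a plain list of hostnames; the function name is an assumption, and the hostnames are the ones from the example.

```python
import re


def build_hostname_pattern(hostnames):
    """Escape each hostname and join them with the Or ('|') character."""
    return "|".join(re.escape(hostname) for hostname in hostnames)


pattern = build_hostname_pattern(
    ["www.mysite.com", "blog.mysite.com", "press.mysite.com"]
)
print(pattern)

# Quick sanity check of what the pattern would and would not include.
print(bool(re.search(pattern, "blog.mysite.com")))
print(bool(re.search(pattern, "spam-host.example")))
```

The resulting string can be pasted into the filter pattern field. Test the filter on a test view before applying it to your main reporting view, since an include filter permanently drops everything it does not match.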

For ‘Crawler’ Referral Spam

  1. Create a new Custom filter
  2. Select ‘Exclude’
  3. Choose ‘Campaign Source’ from the drop-down
  4. Enter the Source you want to exclude
  5. Apply it to relevant views of your data

Monitoring your data on a regular basis is the key to ensuring quick fixes to inevitable threats to data accuracy in your Google Analytics account.

How are you monitoring your data? We’d love to hear from you! If you’re interested in getting Google Analytics help, drop us a line.