What do you do when your site is leaking user information into search results? Well, let’s take a step back and look at what kind of personal information this could be:
- Form information that contains first name, last name, email address
- Physical address information, including ZIP code
- Unsubscribe confirmation URLs that contain email addresses and/or names
- Online ordering information, as recently happened to Panera
There are a number of ways this information could be seeping into Google’s index. The good news is that the sections below cover how to fix most cases. The bad news? If you don’t take care of it and your customers find out:
Best case: You don’t fix it and you lose some customers
Worst case: Your customers are scammed using the exposed information, you face a lawsuit, and Google removes your analytics data because it contains personally identifiable information (PII). You also get dinged under the new EU GDPR rules and have to pay a hefty penalty. And you have to deal with the negative press.
Google provides some best practices on how to prevent this, but if you’re reading this, you might be beyond the prevention phase and need the removal solution first. You’re in luck: Paige Flanagan wrote a post on how to find and address these in Google Analytics.
Even if you remove PII from SERPs, PII in your GA data is a violation of GA’s Terms of Service and puts you at risk of having your analytics data deleted, so don’t gloss over this one if you use GA! Her second point is spot on for this, so don’t just take my word for it. Go read it and come back here. (We’ll even open the link in a new tab for you.)
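Before picking a scenario below, it helps to know which of your URLs actually carry personal data. Here is a minimal Python sketch, not a definitive detector: it flags URLs that have an email address in the path or query string, plus a handful of parameter names. The function name find_pii and the SUSPECT_PARAMS list are our own assumptions for illustration, so swap in the field names your forms really use.

```python
import re
from urllib.parse import urlsplit, parse_qsl, unquote

# Loose email pattern: good enough for flagging, not for validation.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical parameter names that often carry personal data.
SUSPECT_PARAMS = {"email", "first_name", "last_name", "name", "address", "zip"}


def find_pii(url):
    """Return a list of reasons the URL looks like it carries PII (empty if none)."""
    reasons = []
    parts = urlsplit(url)
    # Decode percent-encoding so "jane%40example.com" is seen as an email.
    if EMAIL_RE.search(unquote(parts.path)):
        reasons.append("email in path")
    # parse_qsl already decodes percent-encoded values.
    for key, value in parse_qsl(parts.query, keep_blank_values=True):
        if EMAIL_RE.search(value):
            reasons.append(f"email in parameter '{key}'")
        elif key.lower() in SUSPECT_PARAMS and value:
            reasons.append(f"suspect parameter '{key}'")
    return reasons
```

Run it over whatever list of indexed URLs or GA page paths you can export, and count the hits to see which scenario you are in. It will miss names and street addresses buried in the path, so treat an empty result as a hint rather than an all-clear.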
Scenario One: Your Website Has Under 20 URLs Indexed
You’ve found that a handful of URLs containing personally identifiable information have slipped through. No sweat (for the moment): the easiest way to get those out of search results for at least 90 days, while you plug that hole, is to submit them through Google Search Console.
1) Go to Google Search Console
2) Click on the Google Index dropdown and select “Remove URLs”
3) Click “Temporarily Hide” and add the URLs that are indexed
*WARNING: Entering only “/” in Step 3 can deindex your entire website. DO NOT DO THIS*
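If you are pasting in more than a couple of URLs, a quick sanity check on the list can catch the “/” mistake before it reaches Search Console. The sketch below is an assumption-laden helper of our own (split_removal_list is a hypothetical name, not part of any Google tool). It assumes each entry is either a full URL with a scheme or a root-relative path, and it sets aside anything that points at the site root for manual review.

```python
from urllib.parse import urlsplit


def split_removal_list(entries):
    """Split entries into (safe, needs_review).

    Assumes each entry is a full URL (with scheme) or a root-relative path.
    Anything whose path is empty or "/" points at the site root, so it is
    held back for a human to look at, even if it has a query string.
    """
    safe, needs_review = [], []
    for entry in entries:
        entry = entry.strip()
        path = urlsplit(entry).path
        if path in ("", "/"):
            needs_review.append(entry)
        else:
            safe.append(entry)
    return safe, needs_review
```

Only the first list should go into the removal form; the second is there so nothing gets silently dropped.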
Scenario Two: Your Website Has Hundreds or Thousands of URLs Indexed
If you’ve found that hundreds or even thousands of live pages contain user information, don’t panic just yet. Here are a few ways to make sure it’s removed.
Template Level Removal
It’s very likely the pages you’re looking to remove are part of a template where a “noindex” meta robots tag can be added. Here are instructions to add a meta robots “noindex” tag to a page.
The full article can be found on this Google support page.
If there is no pattern to how these pages are appearing, there’s the very manual process of hardcoding a noindex tag into each page. It’s tedious, but just one live page could result in a GDPR penalty or users being phished.
How much do you think a company like, say, Panera Bread, would pay to make that story go away? Likely more than the cost of an intern spending two days adding these tags. Better safe than sorry.
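If the pages are static HTML files, some of that manual work can be scripted. The following is a rough Python sketch under a few assumptions: the pages have a normal head element, a regex is good enough for your markup (a real HTML parser is safer on messy pages), and add_noindex is a hypothetical helper name of ours. It inserts the standard meta robots noindex tag right after the opening head tag and leaves pages that already have one untouched.

```python
import re

NOINDEX_TAG = '<meta name="robots" content="noindex">'

# Any existing robots meta tag, whatever the attribute order.
ROBOTS_META_RE = re.compile(
    r'<meta\b[^>]*\bname=["\']robots["\'][^>]*>', re.IGNORECASE
)
# Opening <head> tag, with or without attributes (does not match <header>).
HEAD_OPEN_RE = re.compile(r"<head(?:\s[^>]*)?>", re.IGNORECASE)


def add_noindex(html):
    """Return html with a noindex robots meta tag inside <head>."""
    # Already noindexed: nothing to do, so repeated runs are harmless.
    for tag in ROBOTS_META_RE.findall(html):
        if "noindex" in tag.lower():
            return html
    match = HEAD_OPEN_RE.search(html)
    if match is None:
        raise ValueError("no <head> element found")
    # A robots tag without noindex is left in place; review those pages by hand.
    return html[:match.end()] + "\n" + NOINDEX_TAG + html[match.end():]
```

For example, add_noindex(open("thanks.html").read()) would return the updated markup for a hypothetical thanks.html, which you can then write back to disk. Spot-check a few pages afterward, since a tag that lands outside the head will not do its job.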