Using Regular Expressions to Group Keywords in Google Analytics
Earlier this week, SEER held a “Google Analytics 101” training for our interns. Questions came up about RegEx, so I sent the team an email with my most commonly used Regular Expressions for keyword analysis within Google Analytics. Below are the five Regular Expressions I shared with the team:
1) | (pipe) – OR. This is the one you’ll need 98% of the time!
Looking at SEER’s Google Analytics profile, many of our keywords come from people searching for “SEER” or for “Wil.” To see searches that contain either “Wil” or “SEER,” use the pipe: seer|wil. Every keyword that contains either of these two terms will come back.
2) ^ (caret) – Starts with.
^seer finds keywords that start with “seer,” so “SEER Interactive” would match, but “thinkseer” would not.
3) $ (dollar sign) – Ends with. This is the opposite of the caret.
interactive$ finds keywords that end with “interactive,” so “SEER Interactive” would match, but “SEER Interactive Philadelphia” would not.
4) ? (question mark) – Preceding character is optional.
There are almost always going to be variations in how people search for your brand name. I see a lot of people searching for us as “SEER Interactive” but others searching for “seerinteractive.” I want to catch both, and the question mark makes that easy. Using seer ?interactive tells GA that we want all instances of “SEER Interactive” with or without the space.
5) + (plus sign) – Preceding character can be repeated (it must appear at least once).
Know how many people have searched for Wil as “Will” in the last month? Whether they spell it “Wil” or “Will,” the searcher is still looking for the same person. Using the plus sign can help us here. wil+ will find the person searching properly for “Wil” but will also catch “Will,” or even “Willlll.”
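If you want to try these five patterns outside of GA, here is a minimal Python sketch. A few things in it are assumptions for illustration only: the keyword list is made up, the helper name matching_keywords is hypothetical, and Python's re module is standing in for GA's own RegEx matching, which may not behave identically in every case. The re.IGNORECASE flag is there because the examples above treat “seer” and “SEER” as the same thing.

```python
import re

def matching_keywords(pattern, keywords):
    """Return the keywords that contain a match for pattern, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [kw for kw in keywords if regex.search(kw)]

# Made-up sample keywords, not real GA data.
keywords = [
    "seer interactive",
    "seerinteractive",
    "thinkseer",
    "seer interactive philadelphia",
    "wil reynolds",
    "will reynolds",
    "seo agency",
]

pipe_hits = matching_keywords(r"seer|wil", keywords)               # 1) contains either term
caret_hits = matching_keywords(r"^seer", keywords)                 # 2) starts with "seer"
dollar_hits = matching_keywords(r"interactive$", keywords)         # 3) ends with "interactive"
optional_hits = matching_keywords(r"seer ?interactive", keywords)  # 4) space is optional
plus_hits = matching_keywords(r"wil+", keywords)                   # 5) one or more "l"

for label, hits in [
    ("seer|wil", pipe_hits),
    ("^seer", caret_hits),
    ("interactive$", dollar_hits),
    ("seer ?interactive", optional_hits),
    ("wil+", plus_hits),
]:
    print(label, "->", hits)
```

Notice that “seo agency” never shows up, and that “thinkseer” only comes back for the pipe pattern, since none of the others match it.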
These can be paired up, made into Includes or Excludes, used for advanced filters, and used for segments. As an example, if I wanted to look at people searching for “SEER Interactive” or “ThinkSEER,” both with and without spaces, without any longtail, and including people who accidentally typed one “e” or three in SEER, I’d use this: ^se+r ?interactive$|^think ?se+r$
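To see how the combined pattern behaves, here is a quick Python sketch under the same assumptions as before: Python's re module stands in for GA's matching, re.IGNORECASE mimics case-insensitive matching, and the function name is_brand_search and the test keywords are hypothetical.

```python
import re

# The combined pattern from the post. The pipe has the lowest precedence,
# so this reads as (^se+r ?interactive$) OR (^think ?se+r$).
BRAND_PATTERN = re.compile(r"^se+r ?interactive$|^think ?se+r$", re.IGNORECASE)

def is_brand_search(keyword):
    """Return True if the whole keyword is a brand variation, with no longtail."""
    return BRAND_PATTERN.search(keyword) is not None

# Made-up keywords for illustration.
for kw in [
    "SEER Interactive",
    "seerinteractive",
    "ser interactive",
    "seeer interactive",
    "think seer",
    "ThinkSEER",
    "seer interactive philadelphia",
    "thinkseer blog",
    "seer",
]:
    print(repr(kw), "->", is_brand_search(kw))
```

Because both halves of the pattern are anchored with ^ and $, longtail searches like “seer interactive philadelphia” are left out, and “seer” on its own does not match either.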
Please note – This post is intended to be basic. There’s a lot more that can be done in GA with RegEx. If you’re looking for additional information (and can’t wait for our next blog post on the subject!), there are some phenomenal comprehensive posts on RegEx. If you already have a favorite RegEx post, please share!
Follow Rachael on Twitter @RachaelGerson