I’ve been spending a lot of my time working in Power BI lately.
Power BI allows me to quickly connect data that would have taken hours of stacking vlookups in spreadsheets. I can find new opportunities looking at the same data through multiple lenses. Power BI is a great data visualization tool to spot things I might not have seen in my spreadsheet or a line chart.
All this sounds cool, right? It is, but when you’re first starting out and learning about creating relationships in Power BI, running into duplicates means that you might not be able to properly connect your data.
Below I’ve outlined my quick tips for finding and removing duplicates from your data, as well as cleaning and trimming your data.
If you're looking for more Power BI tips, check out our comprehensive guide.
This will allow you to create a list of unique values and prepare to create a connection or build a bridge with those values. I automatically apply these transformations onto any column that I’m prepping to clean and remove duplicates.
How to Find & Remove Duplicates in Power BI
We’ve all been there ...
The dreaded “you’ve got duplicates” popup when trying to create a relationship.
Luckily, it’s easy to quickly identify duplicates within a column. There are some steps to remove and prevent duplication when creating relationships or building a bridge between tables:
Create a quick table with “Value” and “Count of Value” to easily identify duplicates
TRANSFORM > LOWERCASE
=LOWER transforms all of your values into lowercase.
- Right-click your subject column and select transform > lowercase from the dropdown.
TRANSFORM > CLEAN
=CLEAN returns the original text value with non-printable characters removed.
- Right-click your subject column and select transform > clean from the dropdown.
TRANSFORM > TRIM
=TRIM removes whitespace values from the beginning and end of the text value.
- Right-click your subject column and select transform > trim from the dropdown.
REMOVE ROWS > REMOVE BLANK ROWS
- With your subject column selected, click on the remove rows icon, then select remove blank rows.
REMOVE ROWS > REMOVE ERRORS
- With your subject column selected, click on the remove rows icon, then select remove errors.
REMOVE ROWS > REMOVE DUPLICATES
- With your subject column selected, click on the remove rows icon, then select remove duplicates.
If you remember to lowercase, clean, trim, remove blanks, remove errors, and remove duplicates you’ll be on your way to creating a unique list of values and build new relationships between all types of data!
In addition to this guide, watch the video tutorial to cleaning URLs in Power BI: