Ever Wonder Why Acronyms Look Weird in Your Database? (and how to fix them… )

Maybe I’m a little OCD but when I look at a database of account names and find capitalization inconsistencies it leads me to think that those people responsible for data quality aren’t doing a very good job.  Users who add new accounts to the database are given implied permission to be less concerned about the quality of the data they add.  Law enforcement officers call this phenomenon the “broken window theory.”  Questionable data quality encourages more questionable data quality.  If users can point to this as the acceptable level of quality, they can call the accuracy of the entire database into question and before long sales reps use this as one of their excuses for not making their number!

We’ve all seen account names where the abbreviation of their names are mixed case, (e.g. Mtv, Rbs, Cvs, Nbc, etc.). Why are these upper and lower case errors so common?  The answer goes back decades when computers only used upper case. But, users needed upper and lower case as word processing merged with everyday computing.  Someone developed a function that converted the upper case names to a mix of upper and lower case based on the first letter of each word in the text string.  In Excel this function is called PROPER and it will convert MTV NETWORKS COMPANY to Mtv Networks Company or simply MTV to Mtv.  I believe widespread use of PROPER and functions like it are the primary reason you see case errors in account names today.  It is also responsible for incorrect capitalization of prepositions as in “City Of Miami” but that issue is simple to fix with the Find and Replace function in Excel. Just make sure you put a space before and after the “Of” when searching or bad things will happen…

Some organizations simply address the acronym challenge by putting a space between the letters, e.g. M T V instead of MTV but I’m not a fan of that compromise because OCD.

Excel actually has a number of text handling functions which we can use below in a couple of formulae to identify and repair the case errors in your database.  Before we get into the formulae let me set expectations properly. 

There is no simple formulaic solution that corrects capitalization of acronyms without human participation.  You will have to manually review about 20% of your accounts because they have two, three or four letters in the first word of their name.  It takes human intervention to determine if those two, three or four letters are a name, a word or should be an acronym.  The formula I use makes this manual job a whole lot easier assuming you have basic administrator ETL skills with your data set.

The first step is to identify which account names in your database need that human intervention.  Extract the account names and a corresponding account ID number (so you can upload the corrected name back into your database) into an Excel worksheet in columns B (name) and C (ID).  Label column A “Qualified”. Copy and paste the below formula into cell A2 and fill down to the bottom of the list of account names.

=IF(AND(IFERROR(FIND(” “,B2,1)-1,LEN(B2))>=2,IFERROR(FIND(” “,B2,1)-1,LEN(B2))<=4),”Check”,””)

The formula will resolve to “Check” for those account names having two, three or four characters in the first word of their name.  Now sort the sheet by column A and column B.  This will group together all those account names needing to be checked.  I’m suggesting you sort by both the formula result and the account name since this is a good time to also look for account name duplicates. Plus having the same words sorted together will allow you to skip over those names that don’t need to be changed much faster.

Now label column D “Corrected Account Name” and column E “Flag”.  Copy this next formula into cell D2 and fill down to the bottom of the list of names where the first formula resolved to “Check”.

=IF(E2=””,B2,UPPER(LEFT(B2,IFERROR(FIND(” “,B2,1)-1,LEN(B2))))&MID(B2,IFERROR(FIND(” “,B2,1),LEN(B2)),LEN(B2)-IFERROR(FIND(” “,B2,1)-1,LEN(B2))))

Begin your manual review of the list and when you find an acronym that should be capitalized in the account name enter any character (e.g. “x”) in the corresponding row in column E and the name will be corrected in column D.  When you have completed the review, select column D and copy > paste special > values to convert the account names into text so they can be uploaded to your database.

So, while this method has worked for me over the years I’m interested if any of my Sales or Marketing Ops readers have a more efficient approach.  Please enlighten us below and thanks for reading!

About Bob Bacon

I work with global B2B high tech Sales leaders to help them enable and optimize the effectiveness of their organization Find out more about Bob here: http://bobbacon.net/blog/about/
This entry was posted in Sales Ops and tagged , , , , , , , , , . Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *


This site uses Akismet to reduce spam. Learn how your comment data is processed.