How to fix 6 common data hygiene issues
It’s likely that your nonprofit organization has some data hygiene issues due to inconsistent or erroneous data entry over time. Lack of time, lack of training, staff turnover, and human error can compound to create quite a mess.
Data hygiene problems tend to fall into one of the following six categories. Each has its own challenges and its own methods for fixing. When you evaluate your data hygiene issues, start by identifying which kind of issue you are dealing with. That will point you toward the right solution.
Ambiguous data is the same code used for two different types of data, like “board” referring to both current and former board members. To fix ambiguous data, decide how each type of data should be coded moving forward, e.g., current board members should be coded “current board” and former board members should be coded “former board”. Then review all of the records with the ambiguous coding and update them as needed. Your database’s global update and import tools can help with this.
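If you export the ambiguous records before running a global update, a small script can propose the new code for each one. The sketch below is only an illustration: the field names (`code`, `term_end`) and the idea of splitting on a board term end date are assumptions, not features of any particular database.

```python
from datetime import date

# Hypothetical export: one row per constituent, some with the ambiguous "board" code.
records = [
    {"id": 101, "code": "board", "term_end": date(2030, 6, 30)},
    {"id": 102, "code": "board", "term_end": date(2019, 6, 30)},
    {"id": 103, "code": "volunteer", "term_end": None},
    {"id": 104, "code": "board", "term_end": None},
]

def recode_board(record, today):
    """Return the proposed code, or None when a person needs to decide."""
    if record["code"] != "board":
        return record["code"]  # not ambiguous, leave it alone
    if record["term_end"] is None:
        return None  # no term date on file: flag for manual review
    return "current board" if record["term_end"] >= today else "former board"

today = date(2025, 1, 1)  # fixed date so the example is repeatable
proposed = {r["id"]: recode_board(r, today) for r in records}
```

Records that come back as `None` are the ones to research by hand before you import the updates.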
Duplicate person and/or organization records are very common in databases. Run a de-duplication process at least quarterly to stay on top of this issue. Your database should have a merge tool that lets you review duplicate records side by side and decide which ones to merge and which data to keep. Some organizations also end up with duplicate gift records, which must be reviewed and updated by hand. With some intermediate Excel skills, you can look for gift transactions in the same amount, on the same date, from the same donor.
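The same-amount, same-date, same-donor check can also be scripted instead of done in Excel. Here is a minimal sketch, assuming a gift export with hypothetical `gift_id`, `donor_id`, `date`, and `amount` columns. It only lists suspects; a donor can legitimately give twice on one day, so the final call still belongs to a person.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal

# Hypothetical gift export; the column names are assumptions for illustration.
gifts = [
    {"gift_id": 1, "donor_id": 7, "date": date(2024, 3, 1), "amount": Decimal("50.00")},
    {"gift_id": 2, "donor_id": 7, "date": date(2024, 3, 1), "amount": Decimal("50.00")},
    {"gift_id": 3, "donor_id": 7, "date": date(2024, 4, 1), "amount": Decimal("50.00")},
    {"gift_id": 4, "donor_id": 9, "date": date(2024, 3, 1), "amount": Decimal("50.00")},
]

def possible_duplicate_gifts(gifts):
    """Group gifts by (donor, date, amount) and return groups with more than one gift."""
    groups = defaultdict(list)
    for gift in gifts:
        key = (gift["donor_id"], gift["date"], gift["amount"])
        groups[key].append(gift["gift_id"])
    return [ids for ids in groups.values() if len(ids) > 1]

suspects = possible_duplicate_gifts(gifts)
```

In this sample, gifts 1 and 2 are flagged together because they share a donor, a date, and an amount.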
Inconsistent data is the same data coded in two different ways. For example, perhaps some of your event attendance data is stored in your events module, and some is stored as a tag on the donor record. To fix inconsistent data, identify the correct method for tracking the data moving forward. Then review the records that are coded using the incorrect method and update them as needed.
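To see how much cleanup is involved, you can compare the two places the data lives. This sketch assumes you have decided the events module is the correct home and that you can export (event, donor ID) pairs from both sources; the event name and IDs are made up.

```python
# Hypothetical exports of (event, donor_id) pairs from each tracking method.
event_module_attendees = {("gala-2024", 7), ("gala-2024", 9)}
tagged_attendees = {("gala-2024", 9), ("gala-2024", 12)}

# Attendance recorded only as a tag still needs to be moved into the events module.
to_migrate = sorted(tagged_attendees - event_module_attendees)
```

Here only donor 12 appears in `to_migrate`, since donor 9 is already recorded in the events module.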
Incorrect data is just plain wrong. If you suspect that some of the data in your system is incorrect, set a threshold for the data you’ll review, e.g., all donors who’ve cumulatively given more than a certain amount, or all gift transactions of a certain amount or higher. Then pull reports based on your threshold criteria and review them manually to identify data to correct. Updating this data may require research within your organization or with third-party sources. For example, you may need to do some online research to fix wrong addresses. If you suspect that gifts have been given incorrect solicitation coding, you may need to rely on staff in your organization to review and correct them.
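If your reporting tool can't easily build the review list, a threshold report is simple to sketch in code. The example below assumes a hypothetical gift export with `donor_id` and `amount` columns and an arbitrary $1,000 cutoff; choose whatever threshold matches the amount of manual review you can actually do.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical gift export; the column names are assumptions for illustration.
gifts = [
    {"donor_id": 7, "amount": Decimal("600.00")},
    {"donor_id": 7, "amount": Decimal("500.00")},
    {"donor_id": 9, "amount": Decimal("250.00")},
    {"donor_id": 12, "amount": Decimal("1500.00")},
]

REVIEW_THRESHOLD = Decimal("1000.00")  # assumed cutoff, not a recommendation

def donors_to_review(gifts, threshold):
    """Return donor IDs whose cumulative giving is more than the threshold."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for gift in gifts:
        totals[gift["donor_id"]] += gift["amount"]
    return sorted(donor for donor, total in totals.items() if total > threshold)

review_list = donors_to_review(gifts, REVIEW_THRESHOLD)
```

The script only narrows down which records to look at. Deciding what is actually wrong still takes manual review.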
Misplaced data is not in its correct home in your database. This is very common with free text fields, which often end up containing a lot of data that would be more reportable if tracked in more structured fields. For example, perhaps instead of using communication preferences to track who would prefer not to receive mail, this information has been noted in a donor notes field. To fix this problem, review the misplaced data, move it into its correct location, and then delete it from its old location.
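A keyword search over the notes field can help you find misplaced data faster than reading every note. This is a rough sketch using the mail preference example: the phrases in the pattern, the `notes` field, and the `no_mail` flag are all assumptions, and a keyword match is only a prompt for review, not proof of what the donor wants.

```python
import re

# Assumed phrases staff might have typed into notes; extend to match your own data.
NO_MAIL_PATTERN = re.compile(
    r"\b(?:do not mail|no mail|remove from mailing list)\b", re.IGNORECASE
)

# Hypothetical donor export with a free text notes field and a structured preference.
donors = [
    {"id": 1, "notes": "Met at gala. DO NOT MAIL per phone call.", "no_mail": False},
    {"id": 2, "notes": "Prefers email updates.", "no_mail": False},
    {"id": 3, "notes": "No mail please", "no_mail": True},
]

def notes_needing_review(donors):
    """Return IDs where the note mentions a mail preference the structured field lacks."""
    return [
        d["id"]
        for d in donors
        if not d["no_mail"] and NO_MAIL_PATTERN.search(d["notes"] or "")
    ]

flagged = notes_needing_review(donors)
```

Donor 3 is skipped here because the structured preference is already set, though you may still want to delete the old note.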
Missing data just isn’t there at all, and this can be the hardest problem to fix! If you have other data sources that contain the data you need, such as an old database or a third-party system, then you are in luck. Otherwise, you may have to rely on old documents and anecdotal information from staff and volunteers.
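When an older source does have the data, backfilling can be sketched like this. The example assumes both systems share the same constituent IDs, which is often not the case in practice, and uses a hypothetical `email` field. It fills only blanks and never overwrites a value that already exists.

```python
# Hypothetical records keyed by a constituent ID shared by both systems (an assumption).
current = {
    1: {"email": "a@example.org"},
    2: {"email": ""},
    3: {"email": None},
}
legacy = {
    2: {"email": "b@example.org"},
    3: {"email": ""},
}

def backfill(current, legacy, field):
    """Fill blank values from the legacy source and return the IDs that were filled."""
    filled = []
    for record_id, record in current.items():
        if record.get(field):
            continue  # already has a value, do not overwrite
        old_value = legacy.get(record_id, {}).get(field)
        if old_value:
            record[field] = old_value
            filled.append(record_id)
    return filled

filled_ids = backfill(current, legacy, "email")
still_missing = [rid for rid, rec in current.items() if not rec.get("email")]
```

Anything left in `still_missing` is where the old documents and staff knowledge come in.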
If you’re dealing with data hygiene issues, check out our previous blog post, which focused on how to prioritize among multiple data problems.