Talent Data C Words Part 4: Clean

July 18, 2019
This post is the fourth in a seven-part series on mastering Talent Data - The 7 C's of Talent Data!

We have already covered the importance of our first three C’s (connected, current, and convenient), which you can read all about in our earlier posts.

In this post, we're talking about clean data - and you should forget what you've already heard about it! No more huge projects of manual data review and cleaning. Let's see how we can make this much more seamless.

As all HR, recruiting and analytics professionals know, if you don't have clean and accurate data, staying competitive in this talent marketplace becomes nearly impossible. 

  • Poor data hygiene means that you’re handicapping your recruiting team before they even have a chance to build the kinds of relationships and talent nurturing funnels that are crucial to talent acquisition.
  • Dirty data can also significantly impact your employee and candidate experience, and therefore your talent brand.
  • Bad data REALLY hurts your workforce analytics and people analytics.

When it comes to CLEAN data, there are four areas to address:

  • Duplicates
  • Errors
  • Generally messy data
  • Data decay

We talked about data decay in our Current data post, but there are other things to consider for keeping your data clean. Duplicates, errors, and messy data aren’t necessarily a result of data decay. Data entry processes can introduce human errors. Overlapping data collection systems and even your daily processes can create duplicates. And poor data management practices can lead to messy data in general (job titles, anyone?).

Duplicate data. Data that is gathered from multiple sources or processes (which happens all the time) has a much higher chance of containing duplicates than single-sourced data. Record duplication is one of the significant problems that data-aware businesses face, and it can impact the bottom line through skewed projections and duplicated marketing efforts. However, the existence of multiple records about the same person (especially in the ATS) is interesting in itself. So rather than de-duping, you should probably connect duplicate records. Every detail on a record could provide deeper insight into the person, giving you a more robust profile and richer candidate information.
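To make the "connect, don't delete" idea concrete, here is a minimal Python sketch. It assumes records are simple dictionaries and that a lower-cased email address is a good enough matching rule; the field names, the sample records, and the match_key rule are all hypothetical, and real matching usually needs fuzzier logic than this.

```python
from collections import defaultdict


def match_key(record):
    """Hypothetical matching rule: trimmed, lower-cased email address."""
    return record.get("email", "").strip().lower()


def connect_duplicates(records):
    """Group records that share a match key and combine their fields.

    Nothing is thrown away: every distinct value seen for a field is kept,
    together with the list of source systems the records came from.
    """
    groups = defaultdict(list)
    for index, record in enumerate(records):
        # Records without an email stay on their own instead of being lumped together.
        key = match_key(record) or f"unmatched-{index}"
        groups[key].append(record)

    profiles = []
    for key, members in groups.items():
        fields = defaultdict(list)
        sources = []
        for record in members:
            sources.append(record["source"])
            for field, value in record.items():
                if field == "source" or not value:
                    continue
                if value not in fields[field]:
                    fields[field].append(value)
        profiles.append({"match_key": key, "sources": sources, "fields": dict(fields)})
    return profiles


# Illustrative sample data only.
records = [
    {"source": "ATS", "email": "Pat.Lee@example.com", "name": "Pat Lee", "title": "Data Analyst"},
    {"source": "CRM", "email": "pat.lee@example.com ", "name": "Patricia Lee", "phone": "555-0100"},
    {"source": "ATS", "email": "sam@example.com", "name": "Sam Ortiz", "title": "Recruiter"},
]

profiles = connect_duplicates(records)
for profile in profiles:
    print(profile["match_key"], profile["sources"], profile["fields"])
```

The two records for the same person end up in one connected profile that keeps both name variants, the job title from one system and the phone number from the other, rather than one record silently overwriting the other.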

Data errors. Sometimes data gets crammed into fields where it doesn't belong - perhaps in the name of expediency. Human error, migration, integration problems, system limitations; the list goes on and on. All data errors create problems when the data is analyzed later. Too many businesses also lose out on crucial analytics because they are not using all of the available, up-to-date data sets. It's important to have mechanisms in place that collect all of the relevant data available to you and feed it into your talent data lake on a timely basis. There is also probably "hidden" data that you are missing out on, and data that has gone stale.

Messy data. What’s messy data, you ask? If you extract only standard models, leave old data behind, or decide that you didn't really need to migrate those resumes, you're leaving BIG holes in your data. Handily, now that there are tools that handle structured, unstructured, and highly specialized data, there's no excuse not to have comprehensive, well-collected data, and it no longer creates a huge administrative or consulting burden.

Messy data also includes areas where we meant well but lost control over time. With the best intentions, we allow teams and individuals to change what were once tightly controlled data fields, purely so they can support the business well. Think about those huge "job title cleanup" projects we've all seen - they are just one example of how much effort it takes to keep data clean!
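As a rough illustration of what automating a "job title cleanup" can look like, the Python sketch below normalizes free-text titles and looks them up in a small list of canonical titles. The abbreviation list and the canonical titles are hypothetical examples, not a real taxonomy, and titles that don't match are returned as None so a person can review them.

```python
import re

# Hypothetical abbreviations and canonical titles, for illustration only.
ABBREVIATIONS = {
    "sr": "senior",
    "jr": "junior",
    "eng": "engineer",
    "swe": "software engineer",
    "mgr": "manager",
}
CANONICAL_TITLES = {
    "senior software engineer": "Senior Software Engineer",
    "software engineer": "Software Engineer",
    "recruiting manager": "Recruiting Manager",
}


def normalize_title(raw):
    """Lower-case, strip punctuation, and expand known abbreviations."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", raw.lower()).strip()
    words = [ABBREVIATIONS.get(word, word) for word in cleaned.split()]
    return " ".join(words)


def canonical_title(raw):
    """Return the canonical title, or None when the title needs human review."""
    return CANONICAL_TITLES.get(normalize_title(raw))


for title in ["Sr. Software Eng", "SR SWE", "recruiting mgr.", "Chief Happiness Officer"]:
    print(repr(title), "->", canonical_title(title))
```

A lookup table like this only covers titles someone has already thought of, which is one reason matching at scale tends to lean on learned models rather than hand-written rules.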

Takeaway: There are a multitude of ways your data gets dirty or messy, and you can't prevent them all

Automating your talent data management

Technology has come a long way since our first relational databases, and AI and machine learning are just two of the emerging technologies that can take away a lot of the heavy lifting of keeping data clean - or at least clean enough to be highly usable.

Here are some of the ways technology helps you keep your data clean:

  • Connect your data so that you can see everything, match fields and ensure you use the best of your data
  • Dedupe records, or even better combine them so that duplicates aren't lost but don't hurt you
  • Correlate similar values so that apples and apples are together even if you have them labeled Granny Smith and Delicious
  • Match fields across systems even if they have different names and data types
  • Stop abandoning data, so you don't lose valuable data and you don't recreate wheels
  • Normalize your records so that different job titles, company names, schools, etc. match even if they are expressed differently (that's also called canonicalization, and we'll cover it in detail as C Word #6!)
  • Expose more of your data.  Sunlight is the best disinfectant, and if you are able to let people see data easily, you'll identify and be able to solve issues
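One item from the list above, matching fields across systems even when names and data types differ, can be sketched in a few lines of Python. The system names ("ats", "rms"), the field names, and the date formats here are assumptions made up for the example; the point is simply that each source field maps to a shared name plus a converter, and anything unmapped is kept rather than abandoned.

```python
from datetime import date, datetime

# Hypothetical mappings: source field name -> (shared field name, converter).
FIELD_MAPS = {
    "ats": {
        "candidate_email": ("email", str.lower),
        "applied_on": ("applied_date", lambda v: datetime.strptime(v, "%m/%d/%Y").date()),
        "yrs_exp": ("years_experience", int),
    },
    "rms": {
        "EmailAddress": ("email", str.lower),
        "ApplicationDate": ("applied_date", date.fromisoformat),
        "ExperienceYears": ("years_experience", lambda v: int(float(v))),
    },
}


def to_shared_schema(system, record):
    """Translate one record into the shared field names and data types.

    Returns (shared, unmapped) so that fields without a mapping are
    preserved for later review instead of being dropped.
    """
    mapping = FIELD_MAPS[system]
    shared, unmapped = {}, {}
    for field, value in record.items():
        if field in mapping:
            name, convert = mapping[field]
            shared[name] = convert(value)
        else:
            unmapped[field] = value
    return shared, unmapped


# Illustrative sample records only.
ats_record = {"candidate_email": "Pat.Lee@Example.com", "applied_on": "07/18/2019",
              "yrs_exp": "5", "notes": "referred by employee"}
rms_record = {"EmailAddress": "pat.lee@example.com", "ApplicationDate": "2019-07-18",
              "ExperienceYears": "5.0"}

ats_shared, ats_unmapped = to_shared_schema("ats", ats_record)
rms_shared, rms_unmapped = to_shared_schema("rms", rms_record)
print(ats_shared == rms_shared, ats_unmapped)
```

Once both records are expressed in the same names and types, they can be compared, connected, or analyzed together without anyone hand-editing either source system.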

Those are only a few of the ways that you can use algorithms and technology (especially data lakes) to keep your talent data clean across the full talent lifecycle. Clean data through pre-hire, hire, engage, develop and even alumni stages ultimately helps you make better talent decisions, and that's important.

SwoopTalent offers powerful software to help you manage the data in your system. This software interfaces with your ATS and RMS to make sure that all your potential candidate data is accurate, up-to-date, and clean. By integrating across internal and external recruitment initiatives, SwoopTalent can ensure that you don't have to duplicate your work to get the candidates you want. Swoop provides a platform for all four of the C words we have covered to date: Connected, Current, Convenient and Clean.

In truth you will never get 100% clean talent data, especially not when you need to use unstructured data (and you do!).  But you can't let the perfect be the enemy of the good in an age where data drives AI, analytics and machine learning.  


Takeaway:  Technology can automatically clean your data for AI, analytics and decision making

Obsessing about having “perfect” data is a waste of your time when there are automated methods of ensuring your data is acceptably clean, and certainly very usable. Coupling clean data with compliant data takes it to a whole new level. Compliance isn't fun, but it is something you really have to do, and that's why we cover data compliance next, in Part 5 of the 7 C’s of Talent Data.

Photo by pan xiaozhen on Unsplash

Want the full whitepaper?

At the end of the series we'll release a full paper of the 7 key C words in talent data, including a few BONUS case studies!  Sign up here and we'll email it to you as soon as it is ready! 
