
Talent Data C Words Part 6:  Canonicalized

SwoopTalent
July 25, 2019
This post is the sixth in a seven-part series on mastering Talent Data: The 7 C's of Talent Data!

Canonicalized? Say what?  Hey, let's ask Techopedia:

Canonicalization is the process of converting data that involves more than one representation into a standard approved format.

For example, a staff engineer, a sr. software developer and a "ninja" may all be the same thing in the tech industry.  Canonicalization ensures they are labeled the same way.  Very, VERY useful stuff when you are searching, analyzing and using data.

What is Canonicalized Data?

Canonicalization is the process of taking the nine hundred ways you can type something, and tagging them all with a term that is the same for each.  Not CHANGING them all, necessarily, but making sure they are linked.
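
To make that concrete, here is a minimal sketch of tagging rather than changing. It assumes a hand-written alias table; the table, the canonical labels and the function names are all hypothetical, and which titles really belong together is a judgment call for your own data.

```python
import re

# Hypothetical alias table: each known variant points at one canonical label.
# The groupings are illustrative only, not an authoritative taxonomy.
TITLE_ALIASES = {
    "staff engineer": "Software Engineer",
    "sr software developer": "Software Engineer",
    "ninja": "Software Engineer",
    "vice president of sales": "Head of Sales",
    "chief revenue officer": "Head of Sales",
}


def normalize_key(raw: str) -> str:
    """Lowercase and turn punctuation/extra whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", raw.lower()).strip()


def tag_title(raw: str) -> dict:
    """Return the title as typed plus its canonical tag (None if unknown)."""
    return {"original": raw, "canonical": TITLE_ALIASES.get(normalize_key(raw))}


if __name__ == "__main__":
    for title in ["Sr. Software Developer", '"Ninja"', "Growth Hacker"]:
        print(tag_title(title))
```

The original string comes back untouched next to the tag, so nothing is overwritten. A title that is not in the table simply gets no tag instead of a guess.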

Canonical or Organized Data Management

When the data your team needs exists in many different sources and is created by many different people, you can end up with a lot of terms for the same thing.  Left as it stands, you need to know every variant to be able to group, find or research.  The classic examples in HR and recruiting are job titles, schools and company names.  Have you ever thought about how many different ways there are to express U.Mich?

Remember when the head of sales was a Vice President of Sales? Now we have everything from Director of Business Development to Chief Revenue Officer to, ahem, "growth hacker".  If the data is not normalized or canonicalized, you could completely overlook people with the right skills, experience and knowledge for a job match, or leave them out of a detailed analysis.

Many talent data systems also have less-than-robust search features. The way forward is to find a new organizational or canonical method to gather everything into a searchable database. Connect your ATS data with the other CRMs your team is using. Connecting and curating the information in a data lake can make organization faster and easier.

It is important to be able to match and normalize data that exists from multiple sources, including your ATS, HRMS, LMS, resumes, job boards, communities, social media and your CRM.  Today, technology can do that matching and normalizing for you, using deep learning rather than manual effort.
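
As a rough illustration of matching across sources, the sketch below groups records from several systems under one canonical company name. Everything in it is made up for the example: the records, the company names, the alias table and the function names. It uses a plain lookup, not deep learning, just to show what the matched result looks like.

```python
import re
from collections import defaultdict

# Made-up records from three hypothetical systems.
RECORDS = [
    {"source": "ATS", "person": "Candidate 1", "company": "Acme Corp."},
    {"source": "CRM", "person": "Candidate 2", "company": "ACME Corporation"},
    {"source": "HRMS", "person": "Candidate 3", "company": "acme"},
    {"source": "CRM", "person": "Candidate 4", "company": "Globex Inc"},
]

# Hypothetical alias table mapping normalized spellings to a canonical name.
COMPANY_ALIASES = {
    "acme corp": "Acme",
    "acme corporation": "Acme",
    "acme": "Acme",
    "globex inc": "Globex",
}


def normalize_key(raw: str) -> str:
    """Lowercase and turn punctuation/extra whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", raw.lower()).strip()


def group_by_company(records: list) -> dict:
    """Group records by canonical company; unknown names keep their own spelling."""
    groups = defaultdict(list)
    for record in records:
        raw = record["company"]
        canonical = COMPANY_ALIASES.get(normalize_key(raw), raw)
        groups[canonical].append(record)
    return dict(groups)


if __name__ == "__main__":
    for company, members in group_by_company(RECORDS).items():
        print(company, [m["source"] for m in members])
```

With the grouping in place, one search for the canonical name finds people from every system, whichever spelling each system happened to store.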

By automating the process of canonicalizing these "messy" data sets, you will:

  • include ALL the relevant records in your efforts
  • stop missing talent because they have unusual job titles
  • never, ever again need to do a "data cleanse" project!
  • reduce (even eliminate) the number of data translations you use
  • reduce your maintenance effort
  • allow yourself to get more done!

This is why automating canonicalization is so important.  And it's JUST as important to save the original data and add an extra normalized value, so you don't lose the true history from the source.  You just might want to refer to the original data in a communication with the person, and you do NOT want their job title wrong!
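
One way to picture "keep the original, add a normalized value" is a record with two separate fields. This is only a sketch: the class, field and function names are assumptions, not any particular system's schema.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class JobTitle:
    original: str                    # exactly what the source system stored
    source: str                      # hypothetical source label, e.g. "ATS"
    canonical: Optional[str] = None  # added alongside, never replaces original


def add_canonical(title: JobTitle, canonical: str) -> JobTitle:
    """Return a copy with the canonical value set; the original is left as-is."""
    return replace(title, canonical=canonical)


def greeting_line(name: str, title: JobTitle) -> str:
    """Communications with the person use the title as they wrote it."""
    return f"Hi {name}, we saw your work as {title.original}."


if __name__ == "__main__":
    raw = JobTitle(original="Growth Hacker", source="CRM")
    tagged = add_canonical(raw, "Head of Sales")
    print(tagged)
    print(greeting_line("Sam", tagged))
```

Search and analysis would read the canonical field, while anything sent to the person reads the original field.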

Takeaway:  You must normalize your job titles, schools and company data, while also keeping the originals.

Organizing Data From Multiple HR Technology Systems and Sources

Getting easy access to data from more HR technology sources and systems, in less time, lets users make better decisions more quickly and speeds up collaborative and analytical work. It also helps HR users improve their interactions with customers and colleagues alike. Operations become more efficient because the data is more accessible and can be gathered quickly, in real time.

And if you canonicalize data at the same time, you save practitioner time, reduce mistakes and make mutual understanding (apples to apples) much easier.  Handily, this is an excellent task for machine learning.  Because machine learning can be used to look across huge data sets and identify patterns, your data can be accurately canonicalized with virtually zero manual effort.  And without messing up your source data, either.
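
To give a feel for automated matching, the sketch below suggests a canonical school name for a spelling nobody listed in advance. Note the swap: it uses plain string similarity from Python's standard library (`difflib`), not machine learning, and the school list, the 0.8 cutoff and the function name are all assumptions chosen for the example. A learned model would replace the similarity step.

```python
import difflib
from typing import Optional

# Hypothetical list of canonical school names.
CANONICAL_SCHOOLS = [
    "University of Michigan",
    "Michigan State University",
    "Ohio State University",
]

# Lowercased lookup so comparison ignores capitalization.
_BY_LOWER = {name.lower(): name for name in CANONICAL_SCHOOLS}


def suggest_school(raw: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the closest canonical name above the cutoff, or None."""
    matches = difflib.get_close_matches(
        raw.lower(), list(_BY_LOWER), n=1, cutoff=cutoff
    )
    return _BY_LOWER[matches[0]] if matches else None


if __name__ == "__main__":
    for variant in ["Univ of Michigan", "Stanford"]:
        print(variant, "->", suggest_school(variant))
```

String similarity alone will not connect a short form like U.Mich to the full name, which is one reason the heavy lifting is better left to pattern-learning approaches. Either way, the suggestion is stored next to the source value, not in place of it.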

Another thing technology can do for you and your data!

Takeaway:  Machine learning can automate your data canonicalization, so normalizing is easy now.

As we move on to our final C word, CURATE, we want to point out that the tools have already been developed to accomplish every C! SwoopTalent's platform can do it all for you, automatically. When your teams switch from system to system to do the same work, you hurt productivity. With SwoopTalent solutions, you see all your data, across all your systems, in one place. It's that simple. SwoopTalent can bring the 7 C’s of Talent Data to life. See how

You might want to catch up on the other C words we have already covered: Connected, Current, Convenient, Clean, and Compliant.


