Steps to Ensure Consistent, Accurate, and “Clean” Data

As learners are added to the platform, there are important factors to keep in mind to ensure the accuracy and consistency of the data set for each learner. Data that is coded consistently and without errors is often called “clean data.” Without it, you will not get the full benefit of the platform’s reporting and analytics features.

What data should you include?

Our platform offers the following data fields you can associate with each learner:

Field                      Column Heading to Use in CSV UTF-8 File (must match exactly)
First Name*                first_name
Last Name*                 last_name
Email Address*             email
External GUID**            external_guid
Region                     region
Career Level               career_level
Hire Date                  hire_date
Organization Unit 1***     organization_unit1
Organization Unit 2***     organization_unit2
Organization Unit 3***     organization_unit3
Location****               location
Title****                  title
Manager First Name*****    manager_assigned_first_name
Manager Last Name*****     manager_assigned_last_name
Manager Email*****         manager_assigned_email

*Required in order to make an account

**Only required for organizations using Single Sign On. Consult your Customer Success representative for more information.

***Custom field that can include whatever data your organization prefers (e.g. division), but which stays the same across the organization. Learn more about how to configure custom Organization Unit data fields.

****Completed by the learner when they update their profile.

*****Used to designate the learner’s manager for use with the My Team feature.
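The column headings listed above can be captured once as a reusable header row. The following is a minimal Python sketch, not a platform tool: the function name `write_template` and the file name `roster_template.csv` are illustrative assumptions. It writes plain UTF-8. If your spreadsheet tool expects a byte order mark, Python's `utf-8-sig` codec adds one, but whether the platform accepts a byte order mark is not stated in this article.

```python
import csv

# Column headings copied from the field table; they must match exactly.
TEMPLATE_COLUMNS = [
    "first_name", "last_name", "email", "external_guid",
    "region", "career_level", "hire_date",
    "organization_unit1", "organization_unit2", "organization_unit3",
    "location", "title",
    "manager_assigned_first_name", "manager_assigned_last_name",
    "manager_assigned_email",
]

def write_template(path, encoding="utf-8"):
    """Write a CSV file containing only the header row (hypothetical helper)."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding=encoding) as handle:
        csv.writer(handle).writerow(TEMPLATE_COLUMNS)

# Illustrative file name; choose whatever suits your organization.
write_template("roster_template.csv")
```

Learner rows can then be added beneath the header row in a spreadsheet tool or appended with `csv.writer` in the same way.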

Why does consistent data entry matter?

The data fields themselves are standard, but the codes you use for each learner within each of the fields are flexible. For example, what your organization wishes to regard as different regions or career levels is up to your organization; there are no standards you are forced to use. However, you must stay consistent in your coding and use the same nomenclature for all learners you add to the system. If you are inconsistent, resulting in “unclean data,” the analytics dashboards treat each variation as a separate code, and the graphs and charts will be less useful until the data is cleaned up.

Common examples of inconsistent data

Inconsistent data takes many forms and has many implications throughout the platform.

Inconsistencies that will lead to separate codes in all analytics reports:

  • Using abbreviations sometimes and not others, e.g. “North America” vs. “NA”
  • Variations in punctuation - e.g. “NA” vs. “N.A.”
  • Variations in capitalization - e.g. “NA” vs. “na”
  • Using different scopes of measurement - e.g. “North America” vs. “United States” vs. “California”
  • Different cohorts using different units of measurement or different data categories altogether (e.g. one cohort’s roster being uploaded with Organization Unit 1 representing company division, while another cohort’s roster has Organization Unit 1 representing job function).

For example, a single Organization Unit 1 field might end up containing divisions (e.g. “Finance”), cities (e.g. “Cleveland”), and regions (e.g. “Asia Pacific”).
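Variations in capitalization and punctuation can be found before upload by grouping the values in a column under a normalized key. The sketch below is one possible approach in Python; the function names `normalize` and `find_variants` are illustrative and are not part of the platform. It does not catch abbreviations versus full names (“NA” vs. “North America”) or mixed scopes, which still need a human decision.

```python
from collections import defaultdict

def normalize(code):
    """Lowercase, drop punctuation, and collapse whitespace."""
    kept = [ch.lower() for ch in code if ch.isalnum() or ch.isspace()]
    return " ".join("".join(kept).split())

def find_variants(values):
    """Return normalized keys that appear under more than one spelling."""
    groups = defaultdict(set)
    for value in values:
        groups[normalize(value)].add(value)
    return {
        key: sorted(spellings)
        for key, spellings in groups.items()
        if len(spellings) > 1
    }

# Sample values for a region column, taken from the examples in this article.
regions = ["NA", "N.A.", "na", "North America", "Asia Pacific", "asia pacific"]
for key, spellings in find_variants(regions).items():
    print(key, "->", spellings)
```

Each group reported this way should be replaced by a single agreed spelling before the roster is uploaded.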

Inconsistencies that will lead to errors when bulk uploading users into the platform:

  • Using column headings that do not exactly match the labels used by our platform (listed above), including proper use of underscores, spaces, and capitalization. Note: Even if your file is uploaded without errors, data could still be missing from learners’ profiles if the column headings are not labeled properly.
  • Using special characters in any data field
  • Using your organization’s labels instead of the platform labels for the three custom fields, organization_unit1, organization_unit2, and organization_unit3
  • Not completing external_guid if your organization uses Single Sign On
  • Not using a CSV UTF-8 file
  • Including special characters in your CSV file name or using a very long file name

Therefore, we strongly encourage creating a data template and sharing that with all current and future groups at your organization using the cohort learning platform. Please speak with your Customer Success representative for more information.
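One way to support a shared template is to check each roster against it before uploading. The following Python sketch is a local pre-check only: the platform's own upload validation is not documented in this article, and `check_roster` and its `uses_sso` parameter are hypothetical names. It flags column headings that do not match the platform labels, missing required columns, and empty required values.

```python
import csv
import io

# Column headings from the field table; they must match exactly.
PLATFORM_COLUMNS = [
    "first_name", "last_name", "email", "external_guid",
    "region", "career_level", "hire_date",
    "organization_unit1", "organization_unit2", "organization_unit3",
    "location", "title",
    "manager_assigned_first_name", "manager_assigned_last_name",
    "manager_assigned_email",
]
REQUIRED_COLUMNS = ["first_name", "last_name", "email"]

def check_roster(csv_text, uses_sso=False):
    """Return a list of problems found in roster CSV text (hypothetical helper)."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []

    # Headings are compared exactly, including case and underscores.
    for header in headers:
        if header not in PLATFORM_COLUMNS:
            problems.append(f"unrecognized column heading: {header!r}")

    # external_guid is only required when Single Sign On is in use.
    required = REQUIRED_COLUMNS + (["external_guid"] if uses_sso else [])
    for column in required:
        if column not in headers:
            problems.append(f"missing required column: {column}")

    # Row numbers assume the header is on line 1 and no field spans lines.
    for row_number, row in enumerate(reader, start=2):
        for column in required:
            if column in headers and not (row.get(column) or "").strip():
                problems.append(f"row {row_number}: {column} is empty")
    return problems

# Sample roster with a miscapitalized heading and an organization-specific label.
sample = "First_Name,last_name,email,Division\nAda,Lovelace,ada@example.com,Finance\n"
for problem in check_roster(sample):
    print(problem)
```

An empty list means the file passed these particular checks; it does not guarantee that the upload itself will succeed.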

What do I need to consider when setting up my data fields?

While it may seem unnecessary at first, it’s important to think long-term about how the data from the platform will be used to provide value to your organization. Without proper planning, you are likely to run into a situation where inconsistencies cause issues.

Questions to consider:

  1. Will other groups within your organization use this cohort learning platform? If so, would they find value in the way you’d like to code custom data fields (i.e. Organization Units 1-3) and other data fields?
  2. For the custom data fields, what data is most valuable to your company? Is it more important to distinguish learner performance data by division, for example, or is there another field that would be valuable and necessary to include?
  3. How will the learner data be added to each course roster? Will all groups using the platform go through HR at your company, or will they build the roster themselves? If there is a consistent process to create rosters, you can work with the appropriate team members to ensure consistency.

Clean and consistent data is an extremely important component of the cohort learning process. If you are unsure about any aspect of the data management process, please contact your Customer Success representative.
