What is Reference Data Management (RDM)?

One of the specialized data management solution types encompassed by this Disruptive MDM / PIM List is Reference Data Management (RDM).

Reference data are typically smaller lists of data records that are referenced by master data and transaction data. These lists do not change often. They tend to be externally defined but can also be internally defined within each organization. The table below has some examples of reference data lists used across many organizations and industries:

Reference Data

RDM solutions may offer this functionality around the reference data:

  • The data store that holds the data
  • The user interface for maintaining the lists
  • Access control
  • Hierarchy management, for example how countries have (or do not have) states/provinces that in turn have postal codes
  • Managing relationships and mappings between the list values, for example how a SIC industry sector code relates to NACE industry sector codes
  • Versioning of the lists
  • Language and further context management
  • Audit trails
  • Approval workflows
  • Data integration capabilities
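
As a rough illustration of a few of these capabilities (the data store, versioning and mapping between list values), a minimal Python sketch could look like the one below. The class names, the fields and the codes such as A1 and X9 are hypothetical placeholders. They are not real SIC or NACE values and are not taken from any specific RDM product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ListVersion:
    """One published version of a reference data list (code -> label)."""
    number: int
    values: dict


@dataclass
class ReferenceList:
    """A named reference data list that keeps every published version."""
    name: str
    versions: list = field(default_factory=list)

    def publish(self, values: dict) -> ListVersion:
        # Each publish creates a new version; older versions stay readable.
        version = ListVersion(number=len(self.versions) + 1, values=dict(values))
        self.versions.append(version)
        return version

    def current(self) -> ListVersion:
        return self.versions[-1]


# Hypothetical placeholder codes, not real SIC or NACE values.
scheme_a = ReferenceList("industry_scheme_a")
scheme_a.publish({"A1": "Example sector"})
scheme_a.publish({"A1": "Example sector", "A2": "Another sector"})

# A mapping between two lists: one source code may relate to several target codes.
a_to_b = {"A1": ["X9", "X10"], "A2": ["X11"]}


def map_code(code: str) -> list:
    """Return the related codes in the other scheme, or an empty list."""
    return a_to_b.get(code, [])
```

Keeping old versions readable is what lets master data that was created under an earlier version of a list still be interpreted correctly.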

There are applications that focus purely on RDM, as well as MDM and broader data management solutions / suites that have RDM as one of several capabilities, where the above-mentioned functionality is shared with master data and perhaps other critical application data.

If you use the select your solution service here on the site, RDM is one of the capabilities you can mark as a requirement for your solution.

Multi-Cultural Capabilities in MDM, PIM and Data Quality Management

When implementing Master Data Management (MDM), Product Information Management (PIM) and Data Quality Management (DQM) solutions in international environments – and even in national environments in countries with more than one official language – there will be some requirements around multi-cultural aspects of the solution. These are namely:

  • Multi-language capabilities
  • Handling different character sets and script systems
  • Utilizing third party reference data

Multi-language capabilities

A lot of master data elements can be represented in different languages. Examples are:

  • City names
  • Product classifications
  • Product descriptions

The ability to store master data in different language versions, validate according to the language in play and consume the data in the right context can be an essential factor when selecting the right tool and configuration of a tool.
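
To make the idea of storing language versions and consuming them in the right context a bit more concrete, here is a minimal Python sketch. The structure (a dictionary keyed by language code) and the fallback rule are assumptions for illustration only. Real solutions may model this quite differently.

```python
# Language versions of one city name, keyed by ISO 639-1 language code.
city_names = {
    "da": "København",
    "en": "Copenhagen",
    "de": "Kopenhagen",
}


def localized(names: dict, language: str, fallback: str = "en") -> str:
    """Return the name in the requested language, else in the fallback language."""
    if language in names:
        return names[language]
    return names[fallback]
```

A consumer asking for Danish gets "København", while a consumer asking for a language that is not stored, for example French, falls back to the English version.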

When doing a Proof of Concept of a solution where the end environment will be international, the pilot must cover several languages.

Character sets and script systems

Critical data elements must often be kept in versions covering more than one character set or even script system. Examples are:

  • Person and company names
  • Postal addresses
  • Product features

While most solutions in the MDM, PIM and DQM space have developed a lot since the days of only covering the English alphabet in capital letters, there are still capability gaps worth exploring before implementing your solution in all corners of the world.
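
One simple way to probe for such gaps during a pilot is to check whether sample values survive the character encoding used by a field or an interface. The Python sketch below is only an illustration. The function name and the sample values are chosen for this example, and the check says nothing about other aspects such as sorting, searching or matching across scripts.

```python
def fits_encoding(value: str, encoding: str) -> bool:
    """Return True if every character in value can be stored in the given encoding."""
    try:
        value.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


samples = ["Smith", "Müller", "Łódź", "東京"]

# Values that a Latin-1 only field would not be able to hold.
lost_in_latin1 = [s for s in samples if not fits_encoding(s, "latin-1")]
```

With these samples, "Smith" and "Müller" fit into Latin-1, while "Łódź" and "東京" do not. All four fit into UTF-8.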

Third party reference data

The availability and requirements around referencing and enriching with third party data from public, commercial and open sources differ a lot from country to country. This includes:

  • External company IDs such as Legal Entity Identifiers, Duns Numbers, national registration numbers (SIREN, ABN, KvK, CVR …) and VAT numbers.
  • External person IDs such as national ID systems (SSN, NINO, NIF, CF, INSEE, SIN, CPR …)
  • Postal code systems (ZIP, PLZ, PIN …) and address directories for the given country.

When working with third party reference data for purposes of unique identification, enrichment and validation you must strike a balance between having a uniform global process and local best practices (and lawful processes) for each master data domain.
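
The balance between a uniform global process and local practices can be sketched as one global validation function that looks up local rules per country. The Python example below is a minimal sketch under that assumption. It only checks the format of postal codes for three countries, it does not check whether a code actually exists, and the function and variable names are made up for this illustration.

```python
import re

# Local format rules keyed by ISO 3166-1 alpha-2 country code.
POSTAL_CODE_PATTERNS = {
    "US": re.compile(r"\d{5}(-\d{4})?"),  # ZIP or ZIP+4
    "DE": re.compile(r"\d{5}"),           # PLZ, five digits
    "IN": re.compile(r"\d{6}"),           # PIN, six digits
}


def validate_postal_code(country: str, code: str):
    """Return True/False when a local rule exists, or None when no rule is known."""
    pattern = POSTAL_CODE_PATTERNS.get(country)
    if pattern is None:
        return None  # no local rule configured; leave the decision to the caller
    return pattern.fullmatch(code) is not None
```

Returning None for countries without a configured rule keeps the global process running while making the local coverage gap visible.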

Some solutions are very fit for a given geography but may have challenges in other geographies.

Your context, scope and requirements

When selecting an MDM, PIM and/or DQM solution you must take relevant multi-cultural capabilities into consideration. The tool selection help service on this site also covers the multi-cultural context, scope and requirements. You can access Select your solution here.

Multi Cultural MDM PIM DQM

PS: The tower shown is The Tower of Babel.

Hierarchy Management in MDM and PIM

Some of the Master Data Management (MDM), Product Information Management (PIM) and Data Quality Management (DQM) capabilities recently discussed on this blog were deduplication, master data survivorship approaches and workflow management.

Another of the many requirements you can indicate in the select your solution service is hierarchy management.

Hierarchy Management MDM PIM

Master Data Domains

Hierarchy management can be used in the different master data domains, namely:

  • Party master data, where we can have:
    • Companies as customers, suppliers and other business partner roles that are placed in a company family tree. Here you have to place your accounts in the hierarchy as a department, legal entity, national headquarters or global headquarters.
    • Individuals as business contacts at customers, suppliers and other business partners. Here you have to cope with the fact that an individual can have several roles stretching over several accounts.
    • Individuals as employees, contractors and customers. Private customers can be grouped in households and demographic stereotypes.
  • Location master data, where we can have:
    • External universal hierarchies such as continents, countries, states/provinces, postal districts, streets, buildings and units in buildings.
    • Internally defined geographies such as sales districts and organizational regions.
  • Product master data, where we can have:
    • External product classifications such as UNSPSC, GPC, HS, eClass, ETIM and many more.
    • Internal product groups that may be enterprise wide or related to reporting, sales, purchase, production and more.
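
As an illustration of the company family tree mentioned under party master data, the following Python sketch models each account as a node with a parent and walks from a department up to the global headquarters. The class, the function and all company names are hypothetical and only meant to show the principle.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Party:
    """A node in a company family tree."""
    name: str
    level: str                       # e.g. "department", "legal entity"
    parent: Optional["Party"] = None


def path_to_top(party: Party) -> list:
    """Walk from a node up to the top of the hierarchy and return the names."""
    path = []
    node = party
    while node is not None:
        path.append(node.name)
        node = node.parent
    return path


# Hypothetical example company family tree.
global_hq = Party("Example Group", "global headquarters")
national_hq = Party("Example Denmark", "national headquarters", global_hq)
legal_entity = Party("Example Sales A/S", "legal entity", national_hq)
department = Party("Purchasing", "department", legal_entity)
```

Calling path_to_top(department) returns the chain from the department through the legal entity and the national headquarters to the global headquarters, which is the kind of roll-up an enterprise-wide view of a customer or supplier relies on.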

Applications in Play

Traditionally these hierarchies have been managed in ERP systems around specific use cases and the functionality offered by these applications. The challenge in an MDM implementation is to lift the hierarchy management into the MDM platform and cover an enterprise-wide view of the hierarchies in use.

Where there is both an MDM application and a PIM application, there has to be a distinction between which product data hierarchies are governed in MDM and which are governed in PIM.

Handling the lists that make up the hierarchy structure may also be done in a separate Reference Data Management (RDM) solution.

Using graph technology

Picturing master data entities in a strict hierarchy where one must be above another is not always optimal for describing what the real world looks like. Therefore, some MDM solutions also apply graph technology.

Graph technology can strengthen solutions to classic MDM and PIM challenges such as:

  • Providing multiple versions of the truth
  • Handling complex relations between the different master data domains
  • Engaging business users in consuming master data
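
To show the difference from a strict hierarchy, here is a minimal Python sketch of master data held as a graph, where an entity can have any number of typed relations to entities in other domains. It uses plain Python structures rather than a graph database, and all entity and relation names are hypothetical.

```python
from collections import defaultdict

# Edges are (source, relation type, target) triples; all names are hypothetical.
edges = [
    ("Jane Doe", "contact_at", "Example Sales A/S"),
    ("Jane Doe", "contact_at", "Other Supplier Ltd"),
    ("Jane Doe", "member_of", "Doe household"),
    ("Example Sales A/S", "located_at", "Copenhagen office"),
    ("Example Sales A/S", "supplies", "Product 1001"),
]

graph = defaultdict(list)
for source, relation, target in edges:
    graph[source].append((relation, target))


def related(entity: str, relation: str) -> list:
    """Return all targets reachable from entity through the given relation type."""
    return [target for rel, target in graph.get(entity, []) if rel == relation]
```

Here one individual is a contact at two accounts and a member of a household at the same time, and a company is linked to both a location and a product, none of which fits into a single tree.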

Summary

Hierarchy management is an essential capability in MDM and PIM platforms as it can foster a holistic view of customers, suppliers and other parties, locations, products and other master data entities across the enterprise and within the business ecosystem where the enterprise operates.

Master Data Workflow Management

Workflow management has become a core requirement for organizations on the lookout for a Master Data Management (MDM) and/or Product Information Management (PIM) solution.

In the early days of MDM solutions, these applications were often integrated with a third-party workflow engine, but today it has become most common that the workflow management capability is built-in and comes as a – sometimes extra charged – native part of the solution.

The approval process for onboarding master data entities is the most frequent use case for building a master data workflow. However, we increasingly see other use cases addressed, not least data quality checks (which also support the approval process).

For party master data (customer and other roles) some examples of workflow elements are:

For product master data some examples are:

  • Completeness check for various stages (sellable, catalogue ready, online ready)
  • Compliance check and approval
  • Pricing approval
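
The completeness check for various stages can be sketched as a set of required attributes per stage. In the Python example below the stage names follow the list above, while the attribute names and the sample product are assumptions made for the illustration.

```python
# Required attributes per stage; attribute names are assumed for this sketch.
REQUIRED_BY_STAGE = {
    "sellable": ["sku", "name", "price"],
    "catalogue ready": ["sku", "name", "price", "description"],
    "online ready": ["sku", "name", "price", "description", "image_url"],
}


def missing_attributes(product: dict, stage: str) -> list:
    """Return the required attributes that are absent or empty for the stage."""
    return [
        attr for attr in REQUIRED_BY_STAGE[stage]
        if product.get(attr) in (None, "")
    ]


def reached_stages(product: dict) -> list:
    """Return every stage whose completeness check the product passes."""
    return [s for s in REQUIRED_BY_STAGE if not missing_attributes(product, s)]


product = {"sku": "1001", "name": "Example product", "price": 9.5, "description": ""}
```

The sample product passes the check for the sellable stage only. A workflow step could route it to whoever is responsible for filling in the description before it moves on to the catalogue ready stage.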

Below are a few examples of how MDM/PIM vendors promote and explain their workflow offering:

Master Data Workflow Management

The examples are taken from the Contentserv MDM, Dynamicweb PIM, Reltio workflow & collaboration and Semarchy xDM pages.

Three Master Data Survivorship Approaches

One of the core capabilities around data quality in Master Data Management (MDM) solutions is providing data matching functionality with the aim of deduplicating records that describe the same real-world entity, thereby facilitating a 360 degree view of a master data entity.

Identifying the duplicates is hard enough in itself. However, resolving the result of the deduplication process is another challenge.

There are three main approaches for doing that:

Master Data Survivorship Approaches

In the above example we have three records: An orange, a green and a blue one. They are considered to be duplicates, meaning they describe the same real-world person. 

1: Survival of the fittest record

Selecting the record that is the most fit according to a data quality rule is the simplest approach. The rule(s) determining which record will survive are most often based on either:

  • Lineage, where the source systems are prioritized
  • Completeness, like for example which record has the most fields and characters filled

The downside of this approach is that the surviving record only has the data quality of the selected record, which might not be optimal, and that valuable information from the deselected records might get lost.

Data quality tools that are good at identifying duplicates often have this simple method of survivorship.

In the above example the blue record wins and this record survives in the MDM hub, while the orange and the green records only survive in the source system(s).
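
The two rule types can be sketched in a few lines of Python. The record contents, the source system names and their priorities are hypothetical and chosen only so that the lineage rule selects the blue record as in the example.

```python
# Three duplicate records; field values and source names are hypothetical.
records = [
    {"id": "orange", "source": "webshop", "name": "J. Doe", "city": "", "phone": ""},
    {"id": "green", "source": "erp", "name": "Jane Doe", "city": "Copenhagen", "phone": ""},
    {"id": "blue", "source": "crm", "name": "Jane Doe", "city": "", "phone": "555-0100"},
]

# Lineage rule: a lower number means a more trusted source system.
SOURCE_PRIORITY = {"crm": 1, "erp": 2, "webshop": 3}


def completeness(record: dict) -> int:
    """Count filled characters across the data fields (id and source excluded)."""
    return sum(len(v) for k, v in record.items() if k not in ("id", "source"))


def fittest_by_lineage(dupes: list) -> dict:
    """Pick the record that comes from the most trusted source system."""
    return min(dupes, key=lambda r: SOURCE_PRIORITY[r["source"]])


def fittest_by_completeness(dupes: list) -> dict:
    """Pick the record with the most characters filled."""
    return max(dupes, key=completeness)
```

With this sample data the lineage rule keeps the blue record, while the completeness rule would keep the green one. In both cases the information that exists only in the deselected records is lost in the hub, which is the downside described above.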

2: Forming a golden record

In this approach the information from each data element (field) is selected from the record that, by given rules, is the best fit. These rules may be based on lineage, completeness, validity or other data quality dimensions.

Data elements may also be parsed, meaning that the element is split into discrete parts as for example an address line into house number and street name. The outcome may also be a union of the (parsed) data elements coming from the source systems.

In that way a new golden record is formed.

Additionally, values may also be corrected by using external directories, which act as a kind of source system.

This approach is more complex. While it solves some of the data quality pain of the first approach, there will still be situations where information is mixed wrongly or lost, and it is hard to roll back an untrue result.

In the above example the golden record in the MDM hub is formed by data elements from the blue, green and orange records – and the city name is fetched from an external directory.
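
A minimal Python sketch of forming a golden record field by field could look like this. The per-field source priorities, the record contents and the small dictionary standing in for an external address directory are all assumptions made for the illustration.

```python
# Three duplicate records; all values and source names are hypothetical.
records = [
    {"id": "orange", "source": "webshop", "name": "J. Doe", "email": "jane.doe@example.com"},
    {"id": "green", "source": "erp", "name": "Jane Doe", "postal_code": "2100", "city": "Copenhgen"},
    {"id": "blue", "source": "crm", "name": "Jane Doe", "phone": "555-0100"},
]

# Per-field lineage rules: the most trusted source system comes first.
FIELD_PRIORITY = {
    "name": ["erp", "crm", "webshop"],
    "email": ["webshop", "crm", "erp"],
    "phone": ["crm", "erp", "webshop"],
    "postal_code": ["erp", "crm", "webshop"],
}

# Stand-in for an external address directory (postal code -> city name).
CITY_DIRECTORY = {"2100": "København Ø"}


def golden_record(dupes: list) -> dict:
    """Form a new record by choosing each field from the best-fit source."""
    by_source = {r["source"]: r for r in dupes}
    golden = {}
    for field_name, sources in FIELD_PRIORITY.items():
        # Take the first non-empty value in source priority order.
        golden[field_name] = next(
            (by_source[s][field_name] for s in sources
             if s in by_source and by_source[s].get(field_name)),
            "",
        )
    # The city is not taken from any record but looked up externally.
    golden["city"] = CITY_DIRECTORY.get(golden["postal_code"], "")
    return golden
```

The resulting record takes the name and postal code from one source, the email from another and the phone number from a third, and replaces the misspelled city with the value from the directory. Note that nothing in the result tells you where each value came from, which hints at why rolling back a wrong merge is hard unless lineage is stored alongside.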

3: Context aware survivorship 

In this approach the identified duplicates are not physically merged and purged.

Instead, by applying rules based on lineage, completeness and other data quality dimensions, you will be able to make several different golden record views that are fit for a given context. The results may differ in both the surviving data elements and the surviving data records.

This is the most complex approach but also the approach that potentially has the best business fit. The downsides include, besides the complexity, possible performance issues, not least in batch processing.

In the above example the MDM hub includes the orange, green and blue records and presents one surviving golden record for marketing purposes and two surviving golden records for accounting purposes.
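
The principle can be sketched in Python as follows: the stored records are never merged, and each context brings its own rule for what counts as the same entity and which source it trusts. The attribute names, contexts and priorities are hypothetical and chosen so that the outcome resembles the example, with one view for marketing and two for accounting.

```python
from collections import defaultdict

# The duplicates stay in the hub; all values are hypothetical.
records = [
    {"id": "orange", "person": "Jane Doe", "account": "A-100", "source": "webshop"},
    {"id": "green", "person": "Jane Doe", "account": "A-100", "source": "erp"},
    {"id": "blue", "person": "Jane Doe", "account": "B-200", "source": "crm"},
]

# Each context decides which attribute defines "the same entity"
# and which source systems it trusts most.
CONTEXTS = {
    "marketing": {"group_by": "person", "priority": ["crm", "erp", "webshop"]},
    "accounting": {"group_by": "account", "priority": ["erp", "crm", "webshop"]},
}


def golden_views(hub: list, context: str) -> list:
    """Build golden record views for one context without merging stored records."""
    rules = CONTEXTS[context]
    groups = defaultdict(list)
    for record in hub:
        groups[record[rules["group_by"]]].append(record)
    views = []
    for members in groups.values():
        survivor = min(members, key=lambda r: rules["priority"].index(r["source"]))
        # Keep track of which stored records the view is based on.
        views.append({**survivor, "based_on": [m["id"] for m in members]})
    return views
```

Because the views are computed on request, the hub still holds all three records afterwards. This is also where the performance concern comes from: every view has to be derived from the underlying records rather than read from a single merged row.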