Hierarchy Management in MDM and PIM

Some of the Master Data Management (MDM), Product Information Management (PIM) and Data Quality Management (DQM) capabilities recently discussed on this blog were deduplication, master data survivorship approaches and workflow management.

Another of the many requirements you can indicate in the Select Your Solution service is hierarchy management.

Master Data Domains

Hierarchy management can be used in the different master data domains, namely:

  • Party master data, where we can have:
    • Companies as customers, suppliers and other business partner roles that are placed in a company family tree. Here you have to place your accounts in the hierarchy as a department, legal entity, national headquarter or global headquarter.
    • Individuals as business contacts at customers, suppliers and other business partners. Here you have to cope with the fact that an individual can have several roles stretching over several accounts.
    • Individuals as employees, contractors and customers. Private customers can be grouped in households and demographic stereotypes.
  • Location master data, where we can have:
    • External universal hierarchies such as continents, countries, states/provinces, postal districts, streets, buildings and units in buildings.
    • Internally defined geographies such as sales districts and organizational regions.
  • Product master data, where we can have:
    • External product classifications such as UNSPSC, GPC, HS, eClass, ETIM and many more.
    • Internal product groups that may be enterprise-wide or related to reporting, sales, purchase, production and more.
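As an illustration of the company family tree mentioned above, here is a minimal sketch in Python. The `Node` class, the `path_to_root` helper and the company names are hypothetical and only serve to show how an account placed at one level can be rolled up through the levels named in the list.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One account in a company family tree (hypothetical model)."""
    name: str
    level: str  # "department", "legal entity", "national headquarter" or "global headquarter"
    parent: Optional["Node"] = None


def path_to_root(node: Node) -> List[Node]:
    """Walk the parent links from an account up to the top of the family tree."""
    path = []
    current: Optional[Node] = node
    while current is not None:
        path.append(current)
        current = current.parent
    return path


# Made-up example data, not taken from any real directory.
global_hq = Node("Acme Group", "global headquarter")
national_hq = Node("Acme Denmark", "national headquarter", parent=global_hq)
legal_entity = Node("Acme Denmark A/S", "legal entity", parent=national_hq)
department = Node("Acme Denmark A/S, Purchasing", "department", parent=legal_entity)

levels = [n.level for n in path_to_root(department)]
print(levels)
```

A single parent link per node is what makes this a strict hierarchy. It is also the limitation that the graph discussion further down addresses.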

Applications in Play

Traditionally, these hierarchies have been managed in ERP systems, shaped by specific use cases and the functionality those applications offer. The challenge in an MDM implementation is to lift hierarchy management into the MDM platform and cover an enterprise-wide view of the hierarchies in use.

Where there is both an MDM application and a PIM application, there has to be a distinction between the product hierarchies that are governed in MDM and those that are governed in PIM.

Handling the lists that make up the hierarchy structure may also be done in a separate Reference Data Management (RDM) solution.

Using graph technology

Picturing master data entities in a strict hierarchy where one must be above another is not always the best way to describe what the real world looks like. Therefore, some MDM solutions also apply graph technology.

Graph technology can strengthen solutions to classic MDM and PIM challenges such as:

  • Providing multiple versions of the truth
  • Handling complex relations between the different master data domains
  • Engaging business users in consuming master data
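To make the contrast with a strict hierarchy concrete, the sketch below stores relations as typed edges, so an entity can have any number of links across domains. The `MasterDataGraph` class, the relation names and the example entities are all hypothetical. Real MDM solutions would typically rely on a graph database rather than an in-memory dictionary.

```python
from collections import defaultdict


class MasterDataGraph:
    """A tiny in-memory graph of typed relations between master data entities."""

    def __init__(self):
        # source entity -> list of (relation, target entity)
        self._edges = defaultdict(list)

    def relate(self, source, relation, target):
        self._edges[source].append((relation, target))

    def neighbours(self, source, relation=None):
        """Return targets linked from source, optionally filtered by relation type."""
        return [
            target
            for rel, target in self._edges.get(source, [])
            if relation is None or rel == relation
        ]


graph = MasterDataGraph()
# One individual with several roles stretching over several accounts.
graph.relate("person:Jane Doe", "contact_for", "account:Acme Denmark A/S")
graph.relate("person:Jane Doe", "contact_for", "account:Beta Supplies Ltd")
graph.relate("person:Jane Doe", "member_of", "household:Doe family")
# Relations crossing the party, location and product domains.
graph.relate("account:Acme Denmark A/S", "located_in", "location:Copenhagen")
graph.relate("account:Acme Denmark A/S", "buys", "product:Widget 100")

accounts = graph.neighbours("person:Jane Doe", "contact_for")
print(accounts)
```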

Summary

Hierarchy management is an essential capability in MDM and PIM platforms, as it can foster a holistic view of customers, suppliers and other parties, locations, products and other master data entities across the enterprise and within the business ecosystem where the enterprise operates.

Master Data Workflow Management

Workflow management has become a core requirement for organizations looking for a Master Data Management (MDM) and/or Product Information Management (PIM) solution.

In the early days of MDM solutions, these applications were often integrated with a third-party workflow engine. Today, the workflow management capability is most commonly built in and comes as a native part of the solution, sometimes at an extra charge.

The approval process for onboarding master data entities is the most frequent use case for building a master data workflow. However, we increasingly see other use cases addressed, not least data quality checks (which also support the approval process).

For party master data (customer and other roles) some examples of workflow elements are:

For product master data some examples are:

  • Completeness check for various stages (sellable, catalogue ready, online ready)
  • Compliance check and approval
  • Pricing approval
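The completeness check per stage can be sketched as a simple rule table. The stage names come from the list above, while the required fields per stage, the `missing_fields` and `reached_stages` helpers and the sample product are assumptions made for illustration only.

```python
# Hypothetical required attributes per stage; a real PIM would hold these as configuration.
REQUIRED_FIELDS = {
    "sellable": ["sku", "name", "price"],
    "catalogue ready": ["sku", "name", "price", "description", "classification"],
    "online ready": ["sku", "name", "price", "description", "classification", "image_url"],
}


def is_filled(value):
    """Treat None and blank strings as missing; a price of 0 still counts as filled."""
    return value is not None and str(value).strip() != ""


def missing_fields(product, stage):
    """List the attributes that block the product from reaching the given stage."""
    return [f for f in REQUIRED_FIELDS[stage] if not is_filled(product.get(f))]


def reached_stages(product):
    """Return the stages for which the product is complete."""
    return [stage for stage in REQUIRED_FIELDS if not missing_fields(product, stage)]


product = {
    "sku": "W-100",
    "name": "Widget 100",
    "price": 9.95,
    "description": "A small widget",
    "classification": "",
    "image_url": None,
}

print(reached_stages(product))
print(missing_fields(product, "online ready"))
```

In a workflow, the list of missing fields is what would typically be routed to a data steward as a task before the approval step.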

Below are a few examples of how MDM/PIM vendors promote and explain their workflow offering:

Master Data Workflow Management

The examples are taken from the Contentserv MDM, Dynamicweb PIM, Reltio workflow & collaboration and Semarchy xDM pages.

Three Master Data Survivorship Approaches

One of the core capabilities around data quality in Master Data Management (MDM) solutions is data matching functionality, with the aim of deduplicating records that describe the same real-world entity and thereby facilitating a 360-degree view of a master data entity.

Identifying the duplicates is hard enough in itself. However, how to resolve the result of the deduplication process is another challenge.

There are three main approaches for doing that:

Master Data Survivorship Approaches

In the above example we have three records: an orange, a green and a blue one. They are considered to be duplicates, meaning they describe the same real-world person.

1: Survival of the fittest record

Selecting the record that is the most fit according to a data quality rule is the simplest approach. The rules that determine which record will survive are most often based on either:

  • Lineage, where the source systems are prioritized
  • Completeness, for example which record has the most fields and characters filled

The downside of this approach is that the surviving record only has the data quality of the selected record, which might not be optimal, and that valuable information from deselected records might get lost.

Data quality tools that are good at identifying duplicates often have this simple method for survivorship.

In the above example the blue record wins and survives in the MDM hub, while the orange and the green records survive only in the source system(s).
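A minimal sketch of this first approach could look as follows. It ranks records by completeness (number of filled fields, then number of characters) and uses lineage as a tie-breaker. The field values, the source system names and their priorities are invented for illustration, since the content of the records in the figure is not reproduced here.

```python
# Hypothetical source ranking: a lower number means a more trusted source system.
SOURCE_PRIORITY = {"crm": 0, "erp": 1, "webshop": 2}
DATA_FIELDS = ["name", "street", "city", "phone"]

# Made-up duplicate records describing the same person.
records = [
    {"id": "orange", "source": "webshop", "name": "Bob Smith", "street": "", "city": "Londn", "phone": ""},
    {"id": "green", "source": "erp", "name": "Robert Smith", "street": "1 Main St", "city": "", "phone": ""},
    {"id": "blue", "source": "crm", "name": "Robert Smith", "street": "1 Main Street", "city": "", "phone": "555-0100"},
]


def completeness(record):
    """Score a record as (number of filled fields, total characters filled)."""
    filled = [record[f].strip() for f in DATA_FIELDS if record[f].strip()]
    return (len(filled), sum(len(value) for value in filled))


def fittest(candidates):
    """Pick the most complete record; break ties with the source priority."""
    return max(
        candidates,
        key=lambda r: (completeness(r), -SOURCE_PRIORITY[r["source"]]),
    )


survivor = fittest(records)
print(survivor["id"])
```

Note that the misspelled but filled city on the orange record is simply dropped here, which is the information loss described above.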

2: Forming a golden record

In this approach the information from each data element (field) is selected from the record that, by given rules, is the best fit. These rules may be based on lineage, completeness, validity or other data quality dimensions.

Data elements may also be parsed, meaning that the element is split into discrete parts, for example an address line into house number and street name. The outcome may also be a union of the (parsed) data elements coming from the source systems.

In that way a new golden record is formed.

Additionally, values may be corrected by using external directories, which act as a kind of source system.

This approach is more complex. While it solves some of the data quality pain of the first approach, there will still be situations where information is wrongly mixed or lost, and it is hard to roll back an untrue result.

In the above example the golden record in the MDM hub is formed by data elements from the blue, green and orange records, and the city name is fetched from an external directory.
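The field-by-field selection can be sketched as below, assuming a lineage rule (first non-empty value in source priority order) and a lookup of the city by postal code. The records, the source priorities and `CITY_DIRECTORY`, which stands in for an external directory, are all hypothetical.

```python
# Hypothetical source ranking: a lower number means a more trusted source system.
SOURCE_PRIORITY = {"crm": 0, "erp": 1, "webshop": 2}
DATA_FIELDS = ["name", "street", "postal_code", "city", "phone"]

# Stand-in for an external directory keyed by postal code (assumption).
CITY_DIRECTORY = {"SW1A 1AA": "London"}

# Made-up duplicate records describing the same person.
records = [
    {"id": "orange", "source": "webshop", "name": "Bob Smith", "street": "",
     "postal_code": "SW1A 1AA", "city": "Londn", "phone": ""},
    {"id": "green", "source": "erp", "name": "R. Smith", "street": "1 Main Street",
     "postal_code": "", "city": "", "phone": ""},
    {"id": "blue", "source": "crm", "name": "Robert Smith", "street": "",
     "postal_code": "", "city": "", "phone": "555-0100"},
]


def build_golden_record(candidates):
    """Select each field from the best-ranked record that has a value, then correct the city."""
    ranked = sorted(candidates, key=lambda r: SOURCE_PRIORITY[r["source"]])
    golden, lineage = {}, {}
    for field in DATA_FIELDS:
        for record in ranked:
            value = record.get(field, "").strip()
            if value:
                golden[field] = value
                lineage[field] = record["id"]  # remember where each value came from
                break
    city = CITY_DIRECTORY.get(golden.get("postal_code", ""))
    if city:
        golden["city"] = city
        lineage["city"] = "external directory"
    return golden, lineage


golden, lineage = build_golden_record(records)
print(golden)
print(lineage)
```

Keeping the `lineage` mapping next to the golden record is one way to ease the rollback problem, as it shows which source contributed each surviving value.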

3: Context-aware survivorship

In this approach the identified duplicates are not physically merged and purged.

Instead, by applying rules based on lineage, completeness and other data quality dimensions, you can make several different golden record views, each fit for a given context. The results may differ in both the surviving data elements and the surviving data records.

This is the most complex approach but also the approach that potentially has the best business fit. The downsides include, besides the complexity, possible performance issues, not least in batch processing.

In the above example the MDM hub includes the orange, green and blue records and presents one surviving golden record for marketing purposes and two surviving golden records for accounting purposes.
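As a rough sketch of this third approach, the code below keeps all three records and derives golden record views on demand. Each context is assumed to define its own grouping key and its own source priority. The `CONTEXTS` table, the `person` and `billing_account` keys and the sample data are hypothetical, chosen so that marketing gets one view and accounting gets two.

```python
FIELDS = ["name", "email"]

# Made-up duplicate records; none of them is merged or purged.
records = [
    {"id": "orange", "source": "webshop", "person": "P1", "billing_account": "A1",
     "name": "Bob Smith", "email": "bob@example.com"},
    {"id": "green", "source": "erp", "person": "P1", "billing_account": "A1",
     "name": "Robert Smith", "email": ""},
    {"id": "blue", "source": "crm", "person": "P1", "billing_account": "A2",
     "name": "Robert Smith", "email": "robert@example.com"},
]

# Hypothetical rules per context: how to group records and which sources to trust first.
CONTEXTS = {
    "marketing": {"group_by": "person", "priority": ["crm", "webshop", "erp"]},
    "accounting": {"group_by": "billing_account", "priority": ["erp", "crm", "webshop"]},
}


def golden_views(candidates, context):
    """Build the golden record views for one context without changing the stored records."""
    rule = CONTEXTS[context]
    rank = {source: i for i, source in enumerate(rule["priority"])}
    groups = {}
    for record in candidates:
        groups.setdefault(record[rule["group_by"]], []).append(record)
    views = []
    for key, members in groups.items():
        ranked = sorted(members, key=lambda r: rank[r["source"]])
        view = {"key": key, "members": [r["id"] for r in members]}
        for field in FIELDS:
            # First non-empty value in this context's source priority order.
            view[field] = next((r[field] for r in ranked if r[field].strip()), "")
        views.append(view)
    return views


marketing = golden_views(records, "marketing")
accounting = golden_views(records, "accounting")
print(len(marketing), len(accounting))
```

Because the views are computed from the stored records each time, a wrong rule can be changed without losing data, at the cost of the extra processing mentioned above.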