Duplicates vs Nodes in MDM Hierarchies

Identification of duplicate records is a core capability in both Data Quality Management (DQM) and Master Data Management (MDM).

When you inspect records identified as duplicate candidates, you will often have to decide if they describe the same real-world entity or if they describe two real-world entities belonging to the same hierarchy.

Instead of throwing away the latter result, you can store this link in the MDM hub too, as a relation in a hierarchy (or graph), and thus support a broader range of operational and analytic purposes.
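
A minimal sketch of that idea, assuming a hypothetical in-memory hub. The names Decision, Hub and record are made up for illustration and do not come from any particular MDM product. The point is only that a "related, but not the same" verdict is kept as a relation instead of being dropped.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    SAME_ENTITY = "same_entity"        # true duplicate: merge the candidates
    SAME_HIERARCHY = "same_hierarchy"  # distinct entities that belong together
    UNRELATED = "unrelated"            # false positive from matching


@dataclass
class Hub:
    """Hypothetical in-memory stand-in for an MDM hub."""
    merges: list = field(default_factory=list)     # pairs to consolidate
    relations: list = field(default_factory=list)  # edges in the hierarchy/graph

    def record(self, id_a: str, id_b: str, decision: Decision,
               relation: str = "related_to") -> None:
        if decision is Decision.SAME_ENTITY:
            self.merges.append((id_a, id_b))
        elif decision is Decision.SAME_HIERARCHY:
            # Keep the link instead of discarding it.
            self.relations.append((id_a, relation, id_b))
        # UNRELATED pairs are dropped in this sketch.


hub = Hub()
hub.record("cust-1", "cust-2", Decision.SAME_ENTITY)
hub.record("cust-1", "cust-3", Decision.SAME_HIERARCHY, relation="same_household")
hub.record("cust-1", "cust-9", Decision.UNRELATED)
print(hub.merges)
print(hub.relations)
```

A real hub might also remember the unrelated pairs, so the same false positive is not presented for review again.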

Individual Persons and Households

In business-to-consumer (B2C) scenarios a key challenge is to have a 360-degree view of private customers, either as individual persons or as a household with a shared economy.

Here you must be able to distinguish between the individual person, the household and people who just happen to live at the same postal address. The location hierarchy plays a role in solving this case. This quest includes having precise addresses when identifying units in large buildings and knowing the kind of building. The probability of two John Smith records being the same person differs if it is a single-family house address or the address of a nursing home.
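
The John Smith case can be sketched roughly as below. The building types and the probabilities are made-up assumptions for illustration only, not measured values, and the function name is hypothetical. A real matching engine would combine many more signals.

```python
# Illustrative only: these priors are invented, not measured.
SAME_PERSON_PRIOR = {
    "single_family_house": 0.95,
    "apartment_building": 0.60,  # unit unknown: several households share the address
    "nursing_home": 0.30,        # many unrelated residents share the address
}


def same_person_probability(name_a: str, name_b: str, building_type: str) -> float:
    """Rough probability that two records at the same address are one person."""
    if name_a.strip().lower() != name_b.strip().lower():
        return 0.0  # simplification: this sketch only handles exact name matches
    # Unknown building types fall back to a cautious default.
    return SAME_PERSON_PRIOR.get(building_type, 0.50)


house = same_person_probability("John Smith", "john smith ", "single_family_house")
home = same_person_probability("John Smith", "John Smith", "nursing_home")
print(house, home)
```

The design choice is that the location record, not the party record, carries the building type, so the same knowledge can be reused for every party matched at that address.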

Companies / Organizations in Company Family Trees

In business-to-business (B2B) scenarios a key challenge is to have a 360-degree view of these customers. Similar 360-degree scenarios exist with suppliers and other business partners.

Organizations can belong to a company family tree. A basic representation, used for example in the Dun & Bradstreet Worldbase, has branches at a postal address. These branches belong to a legal entity with a headquarters at a given postal address, where there may be other individual branches too. Each legal entity in an enterprise may have a national ultimate mother. In multinational enterprises, there is a global ultimate mother. Public organizations have similar, often very complex, trees.
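
As a rough sketch, such a family tree can be modelled with parent links and walked upwards. The Org class, its fields and the example companies are hypothetical, and the "national ultimate" rule used here (highest ancestor in the same country as the starting node) is a simplifying assumption, not Dun & Bradstreet's exact definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Org:
    name: str
    level: str                      # "branch" or "legal_entity"
    country: str
    parent: Optional["Org"] = None  # next node up the family tree


def global_ultimate(org: Org) -> Org:
    """Follow parent links to the top of the tree."""
    while org.parent is not None:
        org = org.parent
    return org


def national_ultimate(org: Org) -> Org:
    """Highest ancestor (or the node itself) in the same country as `org`."""
    top, node = org, org
    while node.parent is not None:
        node = node.parent
        if node.country == org.country:
            top = node
    return top


group = Org("Example Group Inc", "legal_entity", "US")
dk_holding = Org("Example Holding A/S", "legal_entity", "DK", parent=group)
dk_sales = Org("Example Sales ApS", "legal_entity", "DK", parent=dk_holding)
branch = Org("Example Sales ApS, Aarhus", "branch", "DK", parent=dk_sales)

print(national_ultimate(branch).name)
print(global_ultimate(branch).name)
```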

Products by Variant and Sourcing

Products are also formed in hierarchies. The challenge is to identify if a given product record points to a certain level in the bottom part of a given product hierarchy. Products can have variants in size, colour and more. A product can be packed in different ways. The most prominent product identifier is the Global Trade Item Number (GTIN), which occurs in various representations, for example the Universal Product Code (UPC) popular in North America and the European (now International) Article Number (EAN) popular in Europe. These identifiers are applied by each producer (and in some cases distributor) at the product packing variant level.
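
Because UPC and EAN codes are representations of the same GTIN, it helps to bring them to one form before comparing product records. The sketch below validates the GS1 check digit and left-pads to 14 digits. The function names are my own, and it covers only the numeric 8, 12, 13 and 14 digit forms.

```python
def gtin_check_digit(body: str) -> int:
    """Compute the GS1 check digit for the digits that precede it."""
    total = 0
    # Weights alternate 3, 1, 3, ... starting from the rightmost body digit.
    for i, ch in enumerate(reversed(body)):
        total += int(ch) * (3 if i % 2 == 0 else 1)
    return (10 - total % 10) % 10


def normalize_gtin(code: str) -> str:
    """Validate a GTIN-8, UPC-A, EAN-13 or GTIN-14 and left-pad to 14 digits."""
    digits = code.replace(" ", "").replace("-", "")
    if not (digits.isascii() and digits.isdigit()) or len(digits) not in (8, 12, 13, 14):
        raise ValueError(f"not a GTIN-like code: {code!r}")
    if gtin_check_digit(digits[:-1]) != int(digits[-1]):
        raise ValueError(f"bad check digit: {code!r}")
    return digits.zfill(14)


upc = normalize_gtin("036000291452")   # 12-digit UPC-A
ean = normalize_gtin("0036000291452")  # same item written as a 13-digit EAN
print(upc, ean, upc == ean)
```

Two records with the same normalized GTIN describe the same packing variant. Different packing levels of the same product carry different GTINs, so equal GTINs are a sign of duplicates, while different GTINs may still point to nodes in the same product hierarchy.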

Another uniqueness issue for products is multi-sourcing: the same product from the same original manufacturer can be sourced through more than one supplier, each with its own pricing, discount model, terms of delivery and terms of payment.
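
One way to picture multi-sourcing is to group supplier listings under a single product key while keeping each supplier's commercial terms separate. This is only a sketch: the field names, the sample data and the choice of manufacturer plus manufacturer part number as the key are assumptions for illustration.

```python
from collections import defaultdict

# Each row is one supplier's listing of a product (illustrative data).
supplier_items = [
    {"supplier": "Supplier A", "manufacturer": "Acme", "mpn": "DR-100", "price": 49.00},
    {"supplier": "Supplier B", "manufacturer": "Acme ", "mpn": "dr-100", "price": 46.50},
    {"supplier": "Supplier B", "manufacturer": "Acme", "mpn": "DR-200", "price": 80.00},
]


def group_by_product(items):
    """Group supplier listings by a normalized (manufacturer, part number) key."""
    products = defaultdict(list)
    for item in items:
        key = (item["manufacturer"].strip().lower(), item["mpn"].strip().upper())
        products[key].append(item)
    return dict(products)


products = group_by_product(supplier_items)
for key, offers in products.items():
    print(key, [offer["supplier"] for offer in offers])
```

The product is stored once, and the supplier-specific pricing and terms hang off it as separate offers rather than as duplicate product records.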

Solutions Available

When looking for a solution to support you in this conundrum, the best fit for you may be a best-of-breed Data Quality Management (DQM) tool and/or a capable Master Data Management (MDM) platform.

This Disruptive MDM / PIM / DQM List has the most innovative candidates here.

Get Your Free Tailored MDM / PIM / DQM Solution List

Many analyst market reports in the Master Data Management (MDM), Product Information Management (PIM) and Data Quality Management (DQM) space have a generic ranking of the vendors.

The trouble with generic ranking is that one size does not fit all.

On this list there is no generic ranking. Instead there is a service where you can provide your organization’s context, scope and requirements and within 2 to 48 hours get Your Solution List.

The selection model includes these elements:

  • Your context in terms of geographical reach and industry sector.
  • Your scope in terms of data domains to be covered and organizational scale stretching from specific departments over enterprise-wide to business ecosystem wide (interenterprise).
  • Your specific requirements covering the main requirements that differentiate the vendors on the market.
  • Vendor capabilities.
  • A model that combines those facts into a rectangle where you can choose to:
    • Go ahead with a Proof of Concept with the best fit vendor
    • Make an RFP with the best fit vendors in a shortlist
    • Examine a longlist of best fit vendors and other alternatives like combining more than one solution.

Select Your Solution Model

The vendors included are both the major players on the market and emerging solutions with innovative offerings.

You can get your free solution list here.

Contextual MDM vs Enterprise-Wide, Global, Multidomain MDM

The term “contextual Master Data Management” has been floating around for a couple of years. We can see contextual MDM as smaller pieces of MDM with a given flavour, for example focussing on sub/overlapping disciplines such as:

The focus can also be at:

  • A given locality
  • A given master data domain as customer, supplier, employee, other/all party, product (beyond PIM), location or asset
  • A given business unit

You must eat an elephant one bite at a time. Therefore, contextual MDM makes a good concept for getting achievable wins.

However, in an organization with a high level of data management maturity, the range of contextual MDM use cases, and the solutions for them, will be encompassed by a common enterprise-wide, global, multidomain MDM framework, either as one solution or as a well-orchestrated set of solutions.

One example with dependencies is when working with personalization as part of Product Experience Management (PXM). Here you need customer personas. The elephant in the room, so to speak, is that you have to get the actual personas from Customer MDM and/or the Customer Data Platform (CDP).

The list of solutions on this site covers both one-stop-shopping options for all contextual MDM use cases and specialised solutions for a given contextual MDM use case. Check the growing list here.

What is a Golden Record within Data Management?

The term golden record is a core concept within Master Data Management (MDM) and Data Quality Management (DQM). A golden record is a representation of a real world entity. This representation may be compiled from multiple different representations of that entity in a single or in multiple different databases within the enterprise system landscape.

A golden record is optimized towards meeting data quality dimensions such as:

  • Being a unique representation of the real world entity described
  • Having a complete description of that entity covering all purposes of use in the enterprise
  • Holding the most current and accurate data values for the entity described

In Multidomain MDM we work with a range of different entity types such as party (with customer, supplier, employee and other roles), location, product and asset. The golden record concept applies to all of these entity types, but in slightly different ways.

Party Golden Record

Having a golden record that facilitates a single view of a customer is probably the best-known example of using the golden record concept. Managing customer records and dealing with duplicates of those is the most frequent data quality issue around.

If you are not able to prevent duplicate records from entering your MDM world, which is the best approach, then you have to apply data matching capabilities. When identifying a duplicate you must be able to intelligently merge any conflicting views into a golden record as examined in the post Three Master Data Survivorship Approaches.
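
As a small illustration of merging conflicting views, the sketch below applies one simple survivorship rule: per attribute, the most recently updated non-empty value wins. The rule, the function name and the record layout are assumptions for this example and are not necessarily one of the approaches described in the referenced post.

```python
from datetime import date


def build_golden_record(duplicates: list[dict]) -> dict:
    """Merge duplicates: per attribute, the newest non-empty value survives."""
    golden: dict = {}
    # Process oldest first, so newer non-empty values overwrite older ones.
    for rec in sorted(duplicates, key=lambda r: r["updated"]):
        for attr, value in rec.items():
            if attr in ("id", "updated"):
                continue
            if value not in (None, ""):
                golden[attr] = value
    # Keep lineage back to the source records.
    golden["source_ids"] = [rec["id"] for rec in duplicates]
    return golden


records = [
    {"id": "crm-17", "updated": date(2019, 5, 1),
     "name": "John Smith", "phone": "555-0100", "email": ""},
    {"id": "erp-42", "updated": date(2020, 2, 1),
     "name": "John A. Smith", "phone": None, "email": "john@example.com"},
]
golden = build_golden_record(records)
print(golden)
```

Note that the newer record does not wipe out the phone number from the older one, because empty values never survive. Other rules, such as trusting one source system over another per attribute, would slot into the same loop.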

To a lesser degree we see the same challenges in getting a single view of suppliers, and ultimately you will want to have a single view of any business partner, also where the same real world entity has customer, supplier and other roles towards your organization.

There are party identification systems available. Most countries have national ID systems for both citizens (however in most countries mostly restricted to public administration) and organizations. There is the Legal Entity Identifier (LEI) concept, which is slowly penetrating financial services. Also, there are commercial organization identifiers such as the Duns Number available.

Location Golden Record

Having the same location only represented once in a golden record and applying any party, product and asset record, and ultimately golden record, to that record may be seen as quite academic. Nevertheless, striving for that concept will solve many data quality conundrums.

Location management has different meanings and importance for different industries. One example is that a brewery does business with the legal entity (party) that owns a bar, café or restaurant. However, even though the owner of that place changes, which happens a lot, the brewery is still interested in being the brand served at that place. Also, the brewery wants to keep records of logistics around that place and the historic volumes delivered to that place. Utility and insurance are other examples of industries where the location golden record (should) matter a lot.

Knowing the properties of a location also supports the party deduplication process. For example, if you have two records with the name “John Smith” at the same address, the probability of those two records being the same real world entity depends on whether that location is a single-family house or a nursing home.

Location identification concepts revolve around postal addresses, which are fluffy and vary in format by country, and geocoding systems such as latitude/longitude, UTM coordinates, WGS coordinates and more.

Golden Records

Product Golden Record

Product Information Management (PIM) solutions became popular with the rise of multi-channel, where having the same representation of a product in offline and online channels is essential. The self-service approach in online sales also drove the requirement of managing a lot more product attributes than seen before, which again points to a solution that handles the product entity centrally.

In large organizations that have many business units around the world you struggle with having a local view and a global view of products. A given product may be a finished product to one unit but a raw material to another unit. Even a global SAP rollout will usually not clarify this – rather the contrary.

While third party reference data helps a lot with handling golden records for party and location, this is less the case for product master data. Classification systems and data pools do exist, but will certainly not take you all the way. With product master data you must rely more on second party master data, meaning sharing product master data within the business ecosystems where you operate.

The non-profit organization GS1 has done a lot in implementing the Global Trade Item Number (GTIN) based on the Universal Product Code (UPC) and the European Article Number (EAN) concept. However, there are still some challenges in this concept around packaging levels and more.

Asset (or Thing) Golden Record

In asset master data management you also have different purposes where having a single view of a real world asset helps a lot. Financial purposes and logistic purposes in particular have to be aligned, but there are also a lot of other purposes depending on the industry and the type of asset.

With the rise of the Internet of Things (IoT) we will have to manage a lot more assets (or things) than we have usually considered. When a thing (a machine, a vehicle, an appliance) becomes intelligent and now produces big data, master data management and indeed multi-domain master data management become imperative.

You will want to know a lot about the product model of the thing in order to make sense of the produced big data. For that, you need the product (model) golden record. You will want to have deep knowledge of the location in time of the thing. You cannot do that without the location golden records. You will want to know the different party roles in time related to the thing: the owner, the operator, the maintainer. If you want to avoid chaos, you need party golden records.
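
A minimal sketch of how an asset record could point to the other golden records, with location and party roles tracked over time. All class names, field names and ids are hypothetical, and treating the end date as exclusive is a design assumption of this example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Assignment:
    """Link from an asset to another golden record, valid for a period."""
    kind: str                        # "location", "owner", "operator", "maintainer"
    target_id: str                   # id of the location or party golden record
    valid_from: date
    valid_to: Optional[date] = None  # None means still valid; end date is exclusive


@dataclass
class Asset:
    asset_id: str
    product_model_id: str            # link to the product (model) golden record
    assignments: list = field(default_factory=list)

    def at(self, kind: str, when: date) -> Optional[str]:
        """Return the golden record id of `kind` that was valid on `when`."""
        for a in self.assignments:
            if (a.kind == kind and a.valid_from <= when
                    and (a.valid_to is None or when < a.valid_to)):
                return a.target_id
        return None


truck = Asset("asset-1", "model-X200")
truck.assignments += [
    Assignment("location", "loc-depot-a", date(2020, 1, 1), date(2020, 6, 1)),
    Assignment("location", "loc-depot-b", date(2020, 6, 1)),
    Assignment("owner", "party-leaseco", date(2020, 1, 1)),
]
print(truck.at("location", date(2020, 3, 15)))
print(truck.at("location", date(2020, 7, 1)))
print(truck.at("operator", date(2020, 7, 1)))
```

The asset itself stores only ids. The descriptive data stays in the product, location and party golden records, so a sensor reading from a given date can be interpreted against where the thing was and who operated it at that time.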

Tools That Can Help

This site has a list of innovative MDM and DQM solutions that can help you master golden records. Check out the list here.

Analyst MDM / PIM / DQM Solution Reports Update Mid 2020

Analyst firms occasionally publish market reports with a generic solution overview for Master Data Management (MDM), Product Information Management (PIM) and Data Quality Management (DQM).

Here is an overview of the latest major reports:

Analyst Rankings

The next expected reports include:

  • Information Difference yearly MDM landscape probably later this month
  • Gartner Data Quality Tool Magic Quadrant scheduled for 31st July

PS: You can check out many of the included solutions on This Disruptive MDM / PIM / DQM List.

PPS: You can get a free ranking that also includes the rising stars on the solution market and is based on your context, scope and requirements here.