Six MDMographic Stereotypes

The Select your solution service here on the site asks some questions about the scope of the intended MDM / PIM / DQM solution and the number of master data entity records in scope. Among others, these are:

  • How many B2C customer (consumer, citizen) records are in scope for the solution?
  • How many B2B customer / supplier (company) records are in scope for the solution?
  • How many product (SKU) records are in scope for the solution?

When looking at which disciplines and capabilities are needed, and eventually which solution is the best fit, there are some stereotypes of organizations where we see the same requirements. Here are six such stereotypes:

[Figure: MDMographic Stereotypes]

In types A, B and C, party master data management is in focus, as the number of products (or services) is limited. This is common in, for example, the financial services, telco and utility sectors.

Type A is where we have both B2C customers and B2B customers. Besides B2B customers we also have suppliers and some company master data entities act both in customer and supplier roles or in other business partner (BP) roles.

Type B is where the business model is built on B2B customers. One will, though, always find some anomalies where the customers are private persons, with selling to employees as one example.

Type C is where the business model is built on B2C customers. One will, though, often find a small portion of B2B customers as well. We find type C organizations in, for example, healthcare, membership and education.

In types M, D and R, product master data management is of equal or greater importance than party master data management. In these stereotypes, we therefore also see the need for Product Information Management (PIM).

Type M is found at manufacturers, including within pharmaceuticals. Here the numbers of products, customers and suppliers are at the same level. Customers are typically B2B, but we see an increasing tendency of selling directly to consumers through webshops or marketplaces. Additionally, such organizations are starting to care about, and keep track of, the end customers as in B2B2C.

Type D is merchants being B2B dealers and distributors (wholesale). Though it is still common to separate customer roles and supplier roles, we see an increasing adoption of the business partner (BP) concept, as there can be a substantial overlap of customer and supplier roles. In addition, suppliers can have fictitious customer (accounts receivable) roles, for example when receiving bonuses from suppliers.

Type R is merchants being retailers. With the rise of ecommerce, retailers have the opportunity, within the regulations in place, to keep track of the B2C customers beyond what has traditionally been done in loyalty programs and more.

All master data domains, also those besides parties and products, matter to some degree to all organizations. The stereotypes guide where to begin, and solution providers have the opportunity of doing well with the first domain and, if covered, proceeding with the engagement when other domains come into play.

Master Data, Product Information, Reference Data and Other Data

There is a trend in the data management market that solutions are either going very niche (best-of-breed) in the data domain covered or encompassing a broader range of data types.

This can be seen in the spectrum of master data and product information as reported in the post MDM, PIM or Both.

We also see that governance and management of reference data is included in addition to managing master data as told in the post What is Reference Data Management (RDM)?

Some MDM (and RDM) solutions also extend the reach to cover aspects of transaction data and big data. The main scenarios covered are:

  • Matching of party entities in traditional systems of record with the parties referenced in social streams and weblogs (systems of engagement) as well as in sensor data. This can be used in creating a Customer Data Platform (CDP).
  • Extending data quality and data performance dashboards related to master data to cover aggregated transaction data and big data held in data warehouses and data lakes by using a shared set of reference data.

When product information is to be shared in business ecosystems through Product Data Syndication (PDS), this can be accelerated by using a data lake concept and new data stores as staging areas. This is because a main challenge here is that the data quality standards on the providing side most often differ from the data quality standards on the receiving side.

[Figure: MDM, PIM, RDM and other data]

The diagram is a variation of a diagram included in the whitepaper Intelligent Data Hub – Taking MDM to the Next Level. The original was developed together with Salah Kamel, CEO at Semarchy.

Data Matching vs Deduplication

The two terms data matching and deduplication are often used synonymously.

In the data quality world, deduplication is used to describe a process where two or more data records that describe the same real-world entity are merged into one golden record. This can be executed in different ways as told in the post Three Master Data Survivorship Approaches.

Data matching can be seen as an overarching discipline to deduplication. Data matching is used to identify the duplicate candidates in deduplication. Data matching can also be used to identify matching data records between internal and external data sources as examined in the post Third-Party Data Enrichment in MDM and DQM.
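As a minimal sketch of the candidate identification step, the snippet below compares company names pairwise and reports pairs above a similarity threshold. The record layout, the `duplicate_candidates` function name and the 0.85 threshold are assumptions made for illustration; real matching engines use far richer comparison logic than the standard library's `difflib`.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two lightly normalized strings."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def duplicate_candidates(records, threshold=0.85):
    """Return (id, id, score) for record pairs whose names look alike.

    This only identifies candidates; merging them into a golden record
    (the deduplication itself) is a separate step.
    """
    pairs = []
    for left, right in combinations(records, 2):
        score = similarity(left["name"], right["name"])
        if score >= threshold:
            pairs.append((left["id"], right["id"], round(score, 2)))
    return pairs


# Made-up sample records, not taken from any real source
records = [
    {"id": 1, "name": "Acme Corporation"},
    {"id": 2, "name": "ACME Corporation "},
    {"id": 3, "name": "Globex Ltd"},
]
candidates = duplicate_candidates(records)
print(candidates)
```

Comparing every pair does not scale to large record volumes, which is why a candidate selection criterion is normally applied before the detailed comparison.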

As an end-user organization, you can implement data matching / deduplication technology either from pure-play Data Quality Management (DQM) solution providers or through data management suites and Master Data Management (MDM) solutions, as reported in the post DQM Tools In and Around MDM Tools.

When matching internal data records against external sources, one often-used approach is utilizing the data matching capabilities at the third-party data provider. Providers such as Dun & Bradstreet (D&B), Experian and others offer this service in addition to offering the third-party data.

To close the circle, end-user organizations can use the external data matching result to improve the internal deduplication and more. One example is to apply matched DUNS numbers from D&B for company records as a strong deduplication candidate selection criterion. In addition, such data matching results may often result not in a deduplication, but in building hierarchies of master data.
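The idea can be sketched as follows, assuming each internal company record carries the identifier returned by the external match. The field names `duns` and `parent_duns`, the function names and the sample values are hypothetical and do not reflect any provider's actual data structure.

```python
from collections import defaultdict


def candidates_by_duns(records):
    """Group record ids sharing the same matched identifier (dedup candidates)."""
    groups = defaultdict(list)
    for record in records:
        duns = record.get("duns")
        if duns:  # unmatched records are left for other selection criteria
            groups[duns].append(record["id"])
    return {duns: ids for duns, ids in groups.items() if len(ids) > 1}


def families_by_parent(records):
    """Group distinct identifiers under a shared parent (hierarchy, not merge)."""
    families = defaultdict(set)
    for record in records:
        if record.get("duns") and record.get("parent_duns"):
            families[record["parent_duns"]].add(record["duns"])
    return {parent: sorted(children) for parent, children in families.items()}


# Made-up placeholder identifiers, not real DUNS numbers
records = [
    {"id": 10, "duns": "000000001", "parent_duns": "000000009"},
    {"id": 11, "duns": "000000001", "parent_duns": "000000009"},
    {"id": 12, "duns": "000000002", "parent_duns": "000000009"},
    {"id": 13, "duns": None, "parent_duns": None},
]
dedup_candidates = candidates_by_duns(records)
families = families_by_parent(records)
print(dedup_candidates)
print(families)
```

Records 10 and 11 share an identifier and become deduplication candidates, while record 12 is a different legal entity in the same family and belongs in a hierarchy instead.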

[Figure: Data Matching and Deduplication]

What is Product Data Syndication (PDS)?

Product Information Management (PIM) has a sub-discipline called Product Data Syndication (PDS).

While PIM basically is about how to collect, enrich, store and publish product information within a given organization, PDS is about how to share product information between manufacturers, merchants and marketplaces.

[Figure: Product Data Syndication World]

Marketplaces

Marketplaces are the new kids on the block in this world. Amazon and Alibaba are the best known ones, but there are plenty of them internationally, within given product groups and nationally. Merchants can provide product information related to the goods they are selling on a marketplace. A disruptive force in the supply (or value) chain world is that today manufacturers can sell their goods directly on marketplaces and thereby leave out the merchants. It is, though, still only a fraction of trade that has been diverted this way.

Each marketplace has its own requirements for how product information should be uploaded, encompassing which data elements are needed, the requested taxonomy and data standards, as well as the data syndication method.

Data Pools

One way of syndicating (or synchronizing) data from manufacturers to merchants is going through a data pool. The best known one is the Global Data Synchronization Network (GDSN) operated by GS1 through data pool vendors, where 1WorldSync is the dominant one. Here, trading partners follow the same classification, taxonomy and structure for a group of products (typically food and beverage) and their most common attributes in use in a given geography.

There are plenty of other data pools available focusing on given product groups either internationally or nationally. The concept here is also that everyone will use the same taxonomy and have the same structure and range of data elements available.

Data Standards

Product classifications can be used to apply the same data standards. GS1 has a product classification called GPC. Some marketplaces use the UNSPSC classification provided by the United Nations and – perhaps ironically – also operated by GS1. Other classifications, which also encompass the attribute requirements, are eClass and ETIM.

A manufacturer can have product information in an in-house ERP, MDM and/or PIM application. In the same way a merchant (retailer or B2B dealer) can have product information in an in-house ERP, MDM and/or PIM application. Most often a pair of manufacturer and merchant will not use the same data standard, taxonomy, format and structure for product information.

1-1 Product Data Syndication

Data pools have not substantially penetrated the product data flows encompassing all product groups and all the needed attributes and digital assets. Besides that, merchants also have a desire to provide unique product information and thereby stand out in the competition with other merchants selling the same products.

Thus, the highway in product data syndication is still 1-1 exchange. This highway has these lanes:

  • Exchanging spreadsheets, typically orchestrated so that the merchant requests the manufacturer to fill in a spreadsheet with the data elements defined by the merchant.
  • A supplier portal, where the merchant offers an interface to their PIM environment where each manufacturer can upload product information according to the merchant’s definitions.
  • A customer portal, where the manufacturer offers an interface where each merchant can download product information according to the manufacturer’s definitions.
  • A specialized product data syndication service where the manufacturer can push product information according to their definitions and the merchant can pull linked and transformed product information according to their definitions.
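The last lane above, where product information is pushed in one set of definitions and pulled in another, can be sketched as a link table between attribute names with a transformation per link. All attribute names, the `ATTRIBUTE_LINKS` table and the `pull_for_merchant` function are hypothetical and only illustrate the principle, not any specific syndication service.

```python
# Hypothetical links: manufacturer attribute -> (merchant attribute, transform)
ATTRIBUTE_LINKS = {
    "product_name": ("title", str.strip),
    "net_weight_g": ("weightKg", lambda grams: grams / 1000),
    "colour": ("color", str.lower),
}


def pull_for_merchant(pushed: dict) -> dict:
    """Translate pushed product information into the merchant's definitions."""
    pulled = {}
    for source_name, value in pushed.items():
        link = ATTRIBUTE_LINKS.get(source_name)
        if link is None:
            continue  # attributes without a link are not delivered
        target_name, transform = link
        pulled[target_name] = transform(value)
    return pulled


# Made-up product record in the manufacturer's own structure
pushed = {
    "product_name": " Cordless Drill ",
    "net_weight_g": 1500,
    "colour": "Blue",
    "internal_code": "X1",
}
pulled = pull_for_merchant(pushed)
print(pulled)
```

Keeping the links outside both parties' own systems means neither side has to adopt the other's taxonomy, which is the point of this lane compared with the portal options.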

In practice, the chain from manufacturer to the end merchant may have several nodes, being distributors/wholesalers that reload the data by getting product information from an upstream trading partner and passing this product information to a downstream trading partner.

Data Quality Implications

Data quality is, as always, a concern when information producers and information consumers must collaborate, and in a product data syndication context the extended challenge is that the upstream producer and the downstream consumer do not belong to the same organization. This ecosystem-wide data quality and Master Data Management (MDM) issue was examined in the post Multienterprise MDM.

Third-Party Data Enrichment in MDM and DQM

An often-requested capability in Master Data Management (MDM) and Data Quality Management (DQM) is data enrichment from – and verification against – third-party data providers. The data providers can be government data providers, commercial data providers and open data providers.

The two most commonly used scenarios are:

  • Data enrichment from – and verification against – business directories
  • Verification against – and enrichment from – address directories

Business directory integration

Integration with business directories is done with party master data such as B2B customers and suppliers. The aim is often to enrich already gathered internal master data with external data such as:

  • Industry sector codes such as SIC or NACE codes
  • Company family trees
  • Credit worthiness supporting data

Sometimes you may also want to (conditionally) overwrite – or supplement – internally gathered data such as:

  • Company name
  • Addresses
  • Phone numbers

You may also want to verify that a business exists and catch when a business dissolves.
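A minimal sketch of such enrichment rules is shown below, assuming the directory record has already been matched to the internal record. The field names, the `status` value "dissolved" and the `enrich` function are assumptions made for illustration and do not reflect any provider's API.

```python
# Hypothetical field groups following the two lists above
ENRICH_FIELDS = ("industry_code", "parent_company", "credit_rating")
OVERWRITE_FIELDS = ("company_name", "address", "phone")


def enrich(internal: dict, directory: dict, overwrite: bool = False) -> dict:
    """Return a copy of the internal record enriched from a directory record."""
    result = dict(internal)
    # External-only data is always taken from the directory when present
    for field in ENRICH_FIELDS:
        if field in directory:
            result[field] = directory[field]
    # Internally gathered data is supplemented when empty,
    # and overwritten only when explicitly requested
    for field in OVERWRITE_FIELDS:
        if field not in directory:
            continue
        if overwrite or not result.get(field):
            result[field] = directory[field]
    # Flag businesses that the directory reports as dissolved
    result["dissolved"] = directory.get("status") == "dissolved"
    return result


# Made-up sample data
internal = {"company_name": "Acme Corp", "phone": ""}
directory = {
    "company_name": "Acme Corporation Ltd",
    "phone": "+45 0000 0000",
    "industry_code": "62.01",
    "status": "active",
}
supplemented = enrich(internal, directory)
overwritten = enrich(internal, directory, overwrite=True)
print(supplemented)
print(overwritten)
```

In the default mode the internally gathered company name is kept and only the empty phone number is filled in, while `overwrite=True` lets the directory value win.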

Integration can be done with:

  • Global business directories, where Dun & Bradstreet is the most prominent. The advantage here is a uniform integration point and data structure.
  • National directories for each country often supplied by a government body. The advantage here is localized data fit for national requirements and optimal freshness.

Address verification

Verifying a postal address – and translating it into a standard format – is done with location master data, which most often is part of party master data, with emphasis on B2C customer data.

Also, in this case there are global versus national options.

Some MDM / DQM providers have their own global services. Examples are Informatica, who acquired the service called Address Doctor, and IBM. Other MDM / DQM providers utilize the service called Loqate. The advantage here is a uniform integration point and data structure.

In many countries there are also national services that provide richer and localized data with optimal freshness. The richness may be multi-language versions, granular structures feasible in that country and property data such as which kind of building exists at that address.

A common enrichment type is also getting the geocodes related to a postal address.

Your requirements

Your prioritization of business directory integration and address verification is part of the selection criteria here on the site in the Select your solution service.
