Six AI and ML Use Cases Within MDM

One of the hottest trends in the Master Data Management (MDM) world today is how to exploit Artificial Intelligence (AI) and ignite that with Machine Learning (ML).

This aspiration is not new. It has been going on for years, and you may argue about when computerized decision support and automation stop being the application of advanced algorithms and start being AI. However, the AI and ML theme is gaining traction today as part of digital transformation and, whatever we call it, there are substantial business outcomes to pursue.

As told in the post Machine Learning, Artificial Intelligence and Data Quality, perhaps all use cases for applying AI are dependent on data quality, and MDM plays a crucial role in sustaining data quality efforts.

Here are six use cases that are commonly being addressed by AI and ML capabilities:

  • Translating between taxonomies: As reported in the post Artificial Intelligence (AI) and Multienterprise MDM, emerging technologies can help in translating between the taxonomies in use when digital transformation sets a new bar for utilizing master data in business ecosystems.
  • Transforming unstructured to structured: A lot of data is kept in an unstructured way, and in order to systematically exploit this data in AI-supported business processes we need to make the data more structured. AI and ML can help with that too.
  • Data quality issue prevention: Simple rules for checking integrity and validating data are good – but unfortunately not good enough for ensuring data quality. AI is a way to exploit statistical methods and complex relationships.
  • Categorizing data: Digital transformation, spiced up with increasing compliance requirements, has made data categorization a must, and AI and ML can be an effective way to solve this task, which humans usually cannot cover across an enterprise.
  • Data matching: Establishing a link between multiple descriptions of the same real-world entity across an enterprise and out to third-party reference data has always been a pain. AI and ML can help, as examined in the post The Art in Data Matching.
  • Improving insight: The scope of MDM can be enlarged to Extended MDM Platforms, where other data such as transactions and big data is used to build a 360-degree view of the master data entities. AI and ML are a prerequisite for doing that.

Five Product Information Management Core Aspects

A Product Information Management (PIM) solution must encompass some core aspects of handling product data in a digitalized world where products are exchanged online in self-service scenarios. Here are five essential aspects:

Product Identification

The most common external identifier of a product is a GTIN (Global Trade Item Number), which has these three most common formats:

  • 12-digit UPC – Universal Product Code, which is popular in North America
  • 13-digit EAN – European/International Article Number, which is popular in Europe
  • 14-digit GTIN, which is meant to replace, among others, the two above

We know these numbers from the barcodes on goods in physical shops.
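The last digit of a GTIN is a check digit, calculated with the GS1 modulo-10 scheme where the digits are weighted alternately by 3 and 1, starting from the right. A minimal sketch of a validator for the three formats listed above could look like the following. The function names are chosen here for illustration, and the sample numbers are commonly cited test values, not products discussed in this post.

```python
def gtin_check_digit(body: str) -> int:
    """Compute the check digit for a GTIN body (all digits except the last)."""
    total = 0
    # Walk from the rightmost body digit; weights alternate 3, 1, 3, 1, ...
    for position, char in enumerate(reversed(body)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def is_valid_gtin(code: str) -> bool:
    """Check length (12, 13 or 14 digits) and the trailing check digit."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (12, 13, 14):
        return False
    return gtin_check_digit(code[:-1]) == int(code[-1])


def to_gtin14(code: str) -> str:
    """Left-pad a UPC or EAN with zeros to the 14-digit GTIN format."""
    return code.zfill(14)


print(is_valid_gtin("036000291452"))             # 12-digit UPC
print(is_valid_gtin("4006381333931"))            # 13-digit EAN
print(is_valid_gtin(to_gtin14("036000291452")))  # same item in 14 digits
```

Because the padding only adds leading zeros, the check digit stays the same when a UPC or EAN is stored in the 14-digit format.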

It is worth noticing that a GTIN is applied to each packing level for a product model. So, if we for example have a given model of a magic wand, there could be three GTINs applied:

  • One for a single magic wand
  • One for a box of 25 magic wands
  • One for a pallet of 50 boxes of magic wands

Also, the GTIN is applied to a specific variant of a product model. So, if we have a given model of a pair of trousers, there will be a GTIN for each size and colour variant.

This level of product is also referred to as a SKU – Stock Keeping Unit.
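To get a feel for how quickly the number of identifiers grows, here is a small sketch that enumerates every variant and packing level combination for one product model. The class name, the field names and the sample sizes, colours and packing levels are assumptions made up for this illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class TradeItem:
    """One unit that needs its own GTIN: a variant at a packing level."""
    model: str
    size: str
    colour: str
    packing_level: str


def trade_items(model, sizes, colours, packing_levels):
    """List every variant and packing level combination for a product model."""
    return [
        TradeItem(model, size, colour, level)
        for size, colour, level in product(sizes, colours, packing_levels)
    ]


trousers = trade_items(
    "trousers",
    sizes=["S", "M", "L"],
    colours=["black", "blue"],
    packing_levels=["single pair", "box"],
)
print(len(trousers))  # 3 sizes x 2 colours x 2 packing levels = 12
```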

Besides the GTIN (UPC/EAN) system there are plenty of industry and national number and code systems in play.

Product Classification

There are many reasons why you need to classify your range of products. Therefore, there are also many ways of doing so. You can either use an external classification system or a homegrown classification tailored to your organization's view of the world.

Here are five examples of external standards:

  • UNSPSC (United Nations Standard Products and Services Code) is managed by GS1 US™ for the UN Development Programme (UNDP). This is an open, global, multi-sector standard for classification of products and services. This standard is often used in public tenders and at some marketplaces.
  • GPC (Global Product Classification) is created by GS1 as a separate standard classification within its network synchronization called the Global Data Synchronization Network (GDSN).
  • Harmonized System (HS) codes are commodity codes that are harmonized worldwide and serve as the key classifier in international trade. They determine customs duties, import and export rules and restrictions, as well as documentation requirements. National statistical bureaus may require these codes from businesses doing foreign trade.
  • eCl@ss is a cross-industry product data standard for classification and description of products and services, with an emphasis on being an ISO/IEC compliant industry standard nationally and internationally. The classification guides the eCl@ss standard for product attributes (in eCl@ss called properties) that are needed for a product with a given classification.
  • ETIM develops and manages a worldwide uniform classification for technical products. This classification guides the ETIM standard for product attributes (in ETIM called features) that are needed for a product with a given classification.

Within each organization you can have one – and often several – homegrown classification schemes that exist besides the relevant external ones. One example is how you arrange your range of products in a webshop, similar to how you would arrange the goods in aisles in a physical shop.

Specific product attributes

When selling products in self-service scenarios, a main challenge is that each classification of products needs a specific set of attributes (sometimes called properties or features) in order to provide the information needed to support a buying decision.

So, while some attributes are common to all products, a product belonging to one classification needs a specific set of attributes populated to achieve data completeness, while those same attributes are irrelevant for a product belonging to another classification.

External standards such as eCl@ss and ETIM include a scheme that names and states the attributes needed for a product belonging to a certain classification.
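The idea of classification-specific completeness can be sketched as a simple lookup. The attribute names and the two classifications below are invented for illustration and are not taken from eCl@ss or ETIM; a real implementation would load the required attributes from the chosen standard.

```python
# Hypothetical attribute scheme: names are illustrative, not from a standard.
COMMON_ATTRIBUTES = {"name", "brand"}

REQUIRED_BY_CLASSIFICATION = {
    "trousers": {"size", "colour", "material"},
    "magic wand": {"length_cm", "core_material"},
}


def required_attributes(classification):
    """Common attributes plus those specific to the classification."""
    return COMMON_ATTRIBUTES | REQUIRED_BY_CLASSIFICATION.get(classification, set())


def missing_attributes(product):
    """Required attributes that are absent or empty on the product."""
    populated = {
        name for name, value in product["attributes"].items()
        if value not in (None, "")
    }
    return required_attributes(product["classification"]) - populated


def completeness(product):
    """Share of required attributes that are populated, from 0.0 to 1.0."""
    required = required_attributes(product["classification"])
    return 1 - len(missing_attributes(product)) / len(required)


chinos = {
    "classification": "trousers",
    "attributes": {"name": "Chino", "brand": "Example", "size": "32", "colour": ""},
}
print(sorted(missing_attributes(chinos)))
print(round(completeness(chinos), 2))
```

Note that the wand attributes never count against the trousers: completeness is always measured against the classification the product belongs to.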

Related products

A core challenge in self-service selling is that you have to mimic what a salesman does: if you enter a shop to buy an intended product, the salesman would like you to walk away with a better (and more expensive) choice along with some other products you would need to fulfil the intended purpose of use.

A common trick in a webshop is to present what other users also bought or looked at. That is the crowdsourcing approach. But it does not stop there. You must also present precisely which accessories go with a given product model. You must be able to present a replacement if the intended product is not available anymore (or temporarily out of stock). You can present up-sell options based on the features in question. You can present cross-sell options based on the intended purpose of use.

Digital Assets

When your prospective customer can’t see and feel a product online, you must present high-quality product images that show the product (and not a lot of other things too). These can be product images taken from a range of different angles. You can also provide video clips of the given product.

Besides that, there may be many other types of digital assets related to each product model. This can be installation guides, line drawings, certificates and more.

What ERP Applications Do and Don’t Do

The functionality of Master Data Management (MDM) and Product Information Management (PIM) solutions is in many organizations still taken care of by ERP applications.

However, there are some serious shortcomings in this approach.

If we look at party master data (customer roles and supplier roles), a classic system landscape can, besides the ERP application, also have a CRM application and a separate SRM (Supplier Relationship Management / Supplier Onboarding) application. The master data entities covered by these applications are not the same.

Party master data

On the sell side, the CRM application will typically also hold a crowd of prospective customers that are not (yet) onboarded into the ERP application. In many cases the CRM application will also have records describing indirect customers that will never be in the ERP application. Only the existing direct customers are shared between the CRM and ERP applications. Besides that, the ERP application may have accounts receivable records that have never been onboarded through the CRM application.

On the buy side, one function of an SRM application is to track the onboarding process and thereby be the system of record for prospective suppliers. Only existing suppliers will be shared between the ERP application and the SRM application. Besides that, the ERP application will have accounts payable records that have never been onboarded through the SRM application.

A main reason for having a Master Data Management (MDM) solution is to provide a shared registry of every real-world party entity, no matter in which application it is described, thereby ensuring consistency, uniqueness and other data quality dimensions.
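The overlap between the applications can be pictured with plain sets. The sketch below assumes that the hard part, resolving records to one shared party identifier, has already been done. The identifiers and the function name are made up for the illustration.

```python
def build_party_registry(sources):
    """Map each party identifier to the set of applications describing it."""
    registry = {}
    for application, party_ids in sources.items():
        for party_id in party_ids:
            registry.setdefault(party_id, set()).add(application)
    return registry


# Hypothetical, already-resolved party identifiers per application.
sources = {
    "CRM": {"prospect-1", "indirect-1", "customer-1"},
    "ERP": {"customer-1", "customer-2", "supplier-1", "supplier-2"},
    "SRM": {"prospective-supplier-1", "supplier-1"},
}

registry = build_party_registry(sources)

# Parties known to the ERP application only, never onboarded via CRM or SRM.
only_in_erp = sorted(p for p, apps in registry.items() if apps == {"ERP"})
print(only_in_erp)
print(sorted(registry["customer-1"]))
```

No single application holds all seven parties, which is the gap the shared registry is meant to close.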

When looking at product data, ERP applications must often be supplemented by other applications in order to handle detailed and specific topics.

Product data

Product Lifecycle Management (PLM) applications are becoming popular where enterprise units such as R&D, product management and others have to be supported in handling the series of detailed events that take place from when a new product is thought of for the first time until it is retired, and even after that, in the period where complaints and other events may occur. ERP applications can only properly handle the main status events, such as when the product is ready for sale for the first time, when sale is blocked and when the last piece is taken out of inventory.

Product Information Management (PIM) applications are becoming popular where enterprise units such as sales and marketing need to provide specific product data that varies between different product groups. Not least, the rise of ecommerce has driven a demand for providing very detailed and specific product information to support self-service selling. ERP applications are not built to cater for this complexity and the surrounding functionality.

The information demand in this scenario also encompasses handling a variety of digital assets, ranging from product images taken from many angles to line drawings, videos and more. Depending on the range of requirements, this may be handled in a PIM application or separately in a DAM (Digital Asset Management) application.

Where there is no PIM and/or PLM solution in place, the fallback solution to cover the requirements not fulfilled by ERP is a bunch of spreadsheets.

The reason for having multidomain MDM solutions is to cover the full spectrum of party entities and product entities, together with other master data domains such as locations and assets.

Check out the range of solutions to cover this space on this list.

Six MDMographic Stereotypes

In the Select your solution service here on the site there are some questions about the scope of the intended MDM / PIM / DQM solutions and the number of master data entity records. These are among others:

  • How many B2C customer (consumer, citizen) records are in scope for the solution?
  • How many B2B customer / supplier (company) records are in scope for the solution?
  • How many product (SKU) records are in scope for the solution?

When looking at the needed disciplines and capabilities, and eventually at which solution is the best fit, there are some stereotypes of organizations where we see the same requirements. Here are six such stereotypes:

In types A, B and C, party master data management is in focus, as the number of products (or services) is limited. This is common, for example, in the financial services, telco and utility sectors.

Type A is where we have both B2C customers and B2B customers. Besides B2B customers we also have suppliers, and some company master data entities act in both customer and supplier roles or in other business partner (BP) roles.

Type B is where the business model is based on B2B customers. One will, though, always find some anomalies where the customers are private, with selling to employees as one example.

Type C is where the business model is based on B2C customers. One will, though, often find a small portion of B2B customers as well. We find type C organizations in, for example, healthcare, membership and education.

In types M, D and R, product master data management is of equal or greater importance than party master data management. For these stereotypes, we therefore also see the need for Product Information Management (PIM).

Type M is found at manufacturers, including within pharmaceuticals. Here the numbers of products, customers and suppliers are at the same level. Customers are typically B2B, but we see an increasing tendency to sell directly to consumers through webshops or marketplaces. Additionally, such organizations are starting to care about, and keep track of, the end customers, as in B2B2C.

Type D is merchants being B2B dealers and distributors (wholesale). Though it is still common to separate customer roles and supplier roles, we see an increasing adoption of the business partner (BP) concept, as there can be a substantial overlap of customer and supplier roles. In addition, suppliers can have fictitious customer (accounts receivable) roles, for example when bonuses are received from them.

Type R is merchants being retailers. With the rise of ecommerce, retailers have the opportunity, within the regulations in place, to keep track of the B2C customers beyond what has traditionally been done in loyalty programs and more.

All master data domains, also those besides parties and products, matter to some degree to all organizations. The stereotypes guide where to begin, and solution providers have the opportunity to do well with the first domain and, if covered, proceed with the engagement when other domains come into play.

Data Matching vs Deduplication

The two terms data matching and deduplication are often used synonymously.

In the data quality world, deduplication describes a process where two or more data records that describe the same real-world entity are merged into one golden record. This can be executed in different ways, as told in the post Three Master Data Survivorship Approaches.

Data matching can be seen as an overarching discipline to deduplication. Data matching is used to identify the duplicate candidates in deduplication. Data matching can also be used to identify matching data records between internal and external data sources as examined in the post Third-Party Data Enrichment in MDM and DQM.

As an end-user organization you can implement data matching / deduplication technology either from pure play Data Quality Management (DQM) solution providers or through data management suites and Master Data Management (MDM) solutions, as reported in the post DQM Tools In and Around MDM Tools.

When matching internal data records against external sources, one frequently used approach is utilizing the data matching capabilities of the third-party data provider. Providers such as Dun & Bradstreet (D&B), Experian and others offer this service in addition to offering the third-party data.

To close the circle, end-user organizations can use the external data matching result to improve the internal deduplication and more. One example is to apply a matched D-U-N-S Number from D&B for company records as a strong deduplication candidate selection criterion. In addition, such data matching results may often lead not to a deduplication, but to building hierarchies of master data.
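As a minimal sketch of that idea, the function below treats a shared D-U-N-S Number as a strong duplicate candidate signal and falls back to a simple name similarity when at least one record has no matched number. The record layout, the 0.85 threshold and the placeholder numbers are assumptions for illustration. The similarity measure is Python's standard difflib, standing in for whatever matching engine is actually in use.

```python
from difflib import SequenceMatcher
from itertools import combinations


def duplicate_candidates(records, name_threshold=0.85):
    """Return (id_a, id_b, reason) tuples for likely duplicate company records."""
    candidates = []
    for a, b in combinations(records, 2):
        if a.get("duns") and a.get("duns") == b.get("duns"):
            # Same externally matched identifier: strong candidate.
            candidates.append((a["id"], b["id"], "same D-U-N-S number"))
        elif not (a.get("duns") and b.get("duns")):
            # At least one record lacks a matched number: compare names.
            score = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
            if score >= name_threshold:
                candidates.append((a["id"], b["id"], "similar name"))
        # Two different numbers: not a duplicate candidate, but possibly
        # two members of the same company hierarchy.
    return candidates


# Placeholder data; the D-U-N-S values are not real numbers.
records = [
    {"id": "A", "name": "Acme Corporation", "duns": "000000001"},
    {"id": "B", "name": "ACME Corp.", "duns": "000000001"},
    {"id": "C", "name": "Acme Corporation Ltd", "duns": None},
    {"id": "D", "name": "Globex", "duns": "000000002"},
]

for candidate in duplicate_candidates(records):
    print(candidate)
```

Records A and B would hardly be paired on name similarity alone, which is where the externally matched number earns its keep. The result is only a list of candidates; merging them into a golden record is the separate deduplication step.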
