Data Fabric vs MDM

New terms are constantly emerging in the data management space. One of these is “Data Fabric”.

According to Gartner, the analyst firm, data fabric “enables frictionless access and sharing of data in a distributed network environment.” Usually, one would associate data fabric with big data and edge computing. However, data fabric embraces all kinds of data and computing, from those just mentioned through multi-cloud to traditional on-premises computing and the data stores within.

Data fabric and Master Data Management (MDM) have the same aim, which is that all (master) data must be shared across the enterprise – and eventually also in business ecosystems. This is a prerequisite for successful digital transformation.

The conception of MDM has also developed lately: some of the platforms offered now encompass data other than master data, as examined in the post Master Data, Product Information, Reference Data and Other Data.

So, there is clearly a union and an intersection of data fabric and MDM.

Five Product Information Management Core Aspects

A Product Information Management (PIM) solution must encompass some core aspects of handling product data in a digitalized world where products are exchanged online in self-service scenarios. Here are five essential aspects:

Product Identification

The most common external identifier of a product is a GTIN (Global Trade Item Number), which has these three common formats:

  • 12-digit UPC – Universal Product Code, which is popular in North America
  • 13-digit EAN – European/International Article Number, which is popular in Europe
  • 14-digit GTIN, which is meant to replace, among others, the two above

We know these numbers from the barcodes on goods in physical shops.
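
A GTIN ends with a check digit, which makes it possible to catch most typing errors before a number enters a PIM solution. Here is a minimal sketch in Python of the GS1 modulo-10 check for the three formats above. The function names are illustrative, the sample numbers are only for demonstration, and the 14-digit sample is constructed here by prefixing the 13-digit sample with a packaging indicator digit:

```python
def gtin_check_digit(body: str) -> int:
    """Compute the GS1 check digit for the digits that precede it."""
    total = 0
    # Weights alternate 3, 1, 3, 1, ... starting from the rightmost body digit.
    for position, char in enumerate(reversed(body)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def is_valid_gtin(code: str) -> bool:
    """Return True for a 12-, 13- or 14-digit code with a correct check digit."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (12, 13, 14):
        return False
    return gtin_check_digit(code[:-1]) == int(code[-1])


if __name__ == "__main__":
    for sample in ("036000291452", "4006381333931", "14006381333938"):
        print(sample, is_valid_gtin(sample))
```

The same routine covers all three lengths because the weighting is anchored at the right-hand end of the number.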

It is worth noting that a GTIN is applied to each packaging level of a product model. So, if we for example have a given model of a magic wand, there could be three GTINs applied:

  • One for a single magic wand
  • One for a box of 25 magic wands
  • One for a pallet of 50 boxes of magic wands

Also, the GTIN is applied to a specific variant of a product model. So, if we have a given model of a pair of trousers, there will be a GTIN for each size and colour variant.

This level of product is also referred to as a SKU – Stock Keeping Unit.

Besides the GTIN (UPC/EAN) system there are plenty of industry and national number and code systems in play.

Product Classification

There are many reasons why you need to classify your range of products. Therefore, there are also many ways of doing so. You can either use an external classification system or your homegrown classification tailored to your organization's view of the world.

Here are five examples of external standards:

  • UNSPSC (United Nations Standard Products and Services Code) is managed by GS1 US™ for the UN Development Programme (UNDP). This is an open, global, multi-sector standard for classification of products and services. This standard is often used in public tenders and at some marketplaces.
  • GPC (Global Product Classification) is created by GS1 as a separate standard classification within its data synchronization network, the Global Data Synchronization Network (GDSN).
  • Harmonized System (HS) codes are commodity codes that are harmonized worldwide and represent the key classifier in international trade. They determine customs duties, import and export rules and restrictions as well as documentation requirements. National statistical bureaus may require these codes from businesses doing foreign trade.
  • eCl@ss is a cross-industry product data standard for classification and description of products and services, with an emphasis on being an ISO/IEC compliant industry standard nationally and internationally. The classification guides the eCl@ss standard for product attributes (called properties in eCl@ss) that are needed for a product with a given classification.
  • ETIM develops and manages a worldwide uniform classification for technical products. This classification guides the ETIM standard for product attributes (in ETIM called features) that are needed for a product with a given classification.

Within each organization you can have one – and often several – homegrown classification schemes that exist besides the external ones relevant to that organization. One example is how you arrange your range of products in a webshop, similar to how you would arrange the goods in aisles in a physical shop.

Specific product attributes

When selling products in self-service scenarios, a main challenge is that each classification of products needs a specific set of attributes (sometimes called properties or features) in order to provide the information needed to support a buying decision.

So, while some attributes are common to all products, each classification has a set of attributes that must be populated for a product to have complete data, and those attributes are irrelevant to a product belonging to another classification.

External standards such as eCl@ss and ETIM include a scheme that names and states the attributes needed for a product belonging to a certain classification.
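
The idea of classification-specific attributes can be sketched in a few lines of Python. The classifications and attribute names below are hypothetical and not taken from eCl@ss or ETIM; the point is only that the required set is the common attributes plus those of the product's classification:

```python
# Attributes every product needs (hypothetical example).
COMMON_ATTRIBUTES = {"name", "brand"}

# Hypothetical classification-specific attribute sets.
CLASS_ATTRIBUTES = {
    "trousers": {"size", "colour", "material"},
    "magic_wand": {"length_cm", "wood_type"},
}


def missing_attributes(product: dict) -> set:
    """Return the required attributes that are not populated for this product."""
    required = COMMON_ATTRIBUTES | CLASS_ATTRIBUTES[product["classification"]]
    populated = {
        key for key, value in product["attributes"].items()
        if value not in (None, "")
    }
    return required - populated


if __name__ == "__main__":
    trousers = {
        "classification": "trousers",
        "attributes": {"name": "Chino", "brand": "Acme", "size": "32", "colour": "Navy"},
    }
    print(missing_attributes(trousers))
```

An attribute such as wood type never shows up as missing for the trousers, because it belongs to another classification.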

Related products

A core challenge in self-service selling is that you have to mimic what a salesman does: If you enter a shop to buy an intended product, the salesman would like you to walk away with a better (and more expensive) choice along with some other products you would need to fulfil the intended purpose of use.

A common trick in a webshop is to present what other users also bought or looked at. That is the crowdsourcing approach. But it does not stop there. You must also present precisely which accessories go with a given product model. You must be able to present a replacement if the intended product is not available anymore (or temporarily out of stock). You can present up-sell options based on the features in question. You can present cross-sell options based on the intended purpose of use.
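
One way to hold such relations is as typed links between product models. The following Python sketch is only an illustration under that assumption; the class, the relation names and the product identifiers are hypothetical and do not describe any particular PIM solution:

```python
from collections import defaultdict
from enum import Enum


class RelationType(Enum):
    ACCESSORY = "accessory"
    REPLACEMENT = "replacement"
    UP_SELL = "up-sell"
    CROSS_SELL = "cross-sell"


class ProductRelations:
    """Stores typed, directed relations between product models."""

    def __init__(self) -> None:
        # source product -> relation type -> list of target products
        self._relations = defaultdict(lambda: defaultdict(list))

    def add(self, source: str, relation: RelationType, target: str) -> None:
        self._relations[source][relation].append(target)

    def related(self, source: str, relation: RelationType) -> list:
        # .get() avoids creating empty entries for unknown products.
        return list(self._relations.get(source, {}).get(relation, []))


if __name__ == "__main__":
    relations = ProductRelations()
    relations.add("wand-basic", RelationType.ACCESSORY, "wand-case")
    relations.add("wand-basic", RelationType.UP_SELL, "wand-deluxe")
    relations.add("wand-basic", RelationType.REPLACEMENT, "wand-basic-v2")
    print(relations.related("wand-basic", RelationType.ACCESSORY))
```

Keeping the relation type explicit lets a webshop decide separately what to show as accessories, replacements, up-sell and cross-sell options.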

Digital Assets

When your prospective customer can’t see and feel a product online, you must present high-quality product images that show the product (and not a lot of other things too). These can be product images taken from a range of different angles. You can also provide video clips of the given product.

Besides that, there may be many other types of digital assets related to each product model. This can be installation guides, line drawings, certificates and more.

How to Select Your MDM / PIM / DQM Solution

The solution selection service on this site has been up and running for nearly two months now.

This service is unique because:

  • It is an individual list based on your context, scope and requirements.
  • It is near real-time:
    • You fill in your information here on the site. On average, previous requesters have spent 15 minutes doing so.
    • You will get the report back quickly. On average, previous requesters have received the report within 12 hours.

The solutions considered are those who are:

You can try the free service here: Select your solution.

Get a Grip on Data Quality Dimensions

Data quality dimensions are some of the most used terms when explaining why data quality is important, what data quality issues can be and how you can measure data quality. Ironically, we sometimes use the same data quality dimension term for two different things or use two different data quality dimension terms for the same thing. Some of the troubling terms are:

Validity / Conformity – same same but different

Validity is most often used to describe whether data filled into a data field obeys a required format or is among a list of accepted values. Databases are usually good at this, for example by ensuring that an entered date has the day-month-year sequence asked for and is a date in the calendar, or by cross-checking a data value against another table to see if the value exists there.

The problems arise when data is moved between databases with different rules and when data is captured in textual forms before being loaded into a database.

Conformity is often used to describe whether data adheres to a given standard, like an industry or international standard. Due to complexity and other circumstances, this standard may not be implemented, or only partly be implemented, as database constraints or by other means. Therefore, a given piece of data may be a valid database value and still not comply with a given standard.

For example, the code value “0,255,0” for a colour may be in the accepted format, with all elements in the accepted range between 0 and 255 for an RGB colour code. But the standard for a given product colour may only allow the value “Green” and the other common colour names, and “0,255,0” will, when translated, end up as “Lime” or “High green”.
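
The colour example can be sketched as two separate checks in Python. The list of allowed colour names is a hypothetical stand-in for whatever the relevant standard prescribes:

```python
import re

# Format rule: three comma-separated integers.
RGB_PATTERN = re.compile(r"(\d{1,3}),(\d{1,3}),(\d{1,3})")

# Hypothetical list of colour names allowed by a product colour standard.
ALLOWED_COLOUR_NAMES = {"Red", "Green", "Blue", "Black", "White"}


def is_valid_rgb(value: str) -> bool:
    """Validity: the value has the RGB format and every element is 0-255."""
    match = RGB_PATTERN.fullmatch(value)
    return bool(match) and all(0 <= int(part) <= 255 for part in match.groups())


def conforms_to_standard(value: str) -> bool:
    """Conformity: the value is one of the names the standard allows."""
    return value in ALLOWED_COLOUR_NAMES


if __name__ == "__main__":
    for value in ("0,255,0", "Green", "0,256,0"):
        print(value, is_valid_rgb(value), conforms_to_standard(value))
```

Here “0,255,0” passes the validity check but fails the conformity check, while “Green” does the opposite against the RGB format rule.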

Accuracy / Precision – true, false or not sure

The difference between accuracy and precision is a well-known statistical subject.

In the data quality realm accuracy is most often used to describe if the data value corresponds correctly to a real-world entity. If we for example have a postal address of the person “Robert Smith” being “123 Main Street in Anytown” this data value may be accurate because this person (for the moment) lives at that address.

But if “123 Main Street in Anytown” has 3 different apartments each having its own mailbox, the value does not, for a given purpose, have the required precision.

If we work with geocoordinates, we have the same challenge. A given accurate geocode may be precise enough to give directions to the nearest supermarket, but not precise enough to tell in which apartment the out-of-milk smart refrigerator is.

Timeliness / Currency – when time matters

Timeliness is most often used to state if a given data value is present when it is needed. For example, you need the postal address of “Robert Smith” when you want to send a paper invoice or when you want to establish his demographic stereotype for a campaign.

Currency is most often used to state if the data value is accurate at a given time – for example if “123 Main Street in Anytown” is the current postal address of “Robert Smith”.

Uniqueness / Duplication – positive or negative

Uniqueness is the positive term and duplication is the negative term for the same issue.

We strive to have uniqueness by avoiding duplicates. In data quality lingo duplicates are two (or more) data values describing the same real-world entity. For example, we may assume that

  • “Robert Smith at 123 Main Street, Suite 2 in Anytown”

is the same person as

  • “Bob Smith at 123 Main Str in Anytown”
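
A very simple way to detect such a duplicate is to reduce each record to a normalized match key and compare the keys. The Python sketch below assumes tiny hypothetical lookup tables for nicknames and abbreviations, and it assumes that the suite number can be ignored for matching, which is a choice that depends on the purpose. Real DQM tools use far richer reference data and fuzzy matching:

```python
import re

# Hypothetical lookup tables; real tools ship with large reference data sets.
NICKNAMES = {"bob": "robert"}
ABBREVIATIONS = {"str": "street"}

# Sub-unit details such as ", Suite 2" are dropped before comparing (an assumption).
SUBUNIT = re.compile(r",?\s*\b(suite|apt|unit)\s+\w+", re.IGNORECASE)


def match_key(record: str) -> str:
    """Build a normalized key used only for comparing records."""
    text = SUBUNIT.sub("", record.lower())
    tokens = re.findall(r"[a-z0-9]+", text)
    tokens = [NICKNAMES.get(t, ABBREVIATIONS.get(t, t)) for t in tokens]
    return " ".join(tokens)


def is_duplicate(first: str, second: str) -> bool:
    return match_key(first) == match_key(second)


if __name__ == "__main__":
    a = "Robert Smith at 123 Main Street, Suite 2 in Anytown"
    b = "Bob Smith at 123 Main Str in Anytown"
    print(match_key(a))
    print(is_duplicate(a, b))
```

With these assumptions both records reduce to the same key, so they are flagged as describing the same real-world entity.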

Completeness / Existence – to be, or not to be

Completeness is most often used to tell to what degree all required data elements are populated.

Existence can be used to tell if a given dataset has all the needed data elements for a given purpose defined.

So “Bob Smith at 123 Main Str in Anytown” is complete if we need name, street address and city, but only 75 % complete if we need name, street address, city and preferred colour and preferred colour is an existent data element in the dataset.
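
The arithmetic behind the 75 % figure can be written out as a small Python sketch. The field names are hypothetical; completeness is simply the share of required data elements that are populated:

```python
def completeness(record: dict, required: list) -> float:
    """Share of the required data elements that are populated in the record."""
    populated = sum(1 for field in required if record.get(field) not in (None, ""))
    return populated / len(required)


if __name__ == "__main__":
    bob = {
        "name": "Bob Smith",
        "street_address": "123 Main Str",
        "city": "Anytown",
        "preferred_colour": None,  # the element exists but is not populated
    }
    print(completeness(bob, ["name", "street_address", "city"]))
    print(completeness(bob, ["name", "street_address", "city", "preferred_colour"]))
```

The first call yields 1.0 and the second 0.75, matching the two cases described above.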

Data Quality Management 

Master Data Management (MDM) solutions and specialized Data Quality Management (DQM) tools have capabilities to assess data quality dimensions and improve data quality within the different data quality dimensions.

Check out the range of the best solutions to cover this space here on the list.

What ERP Applications Do and Don’t Do

The functionality of Master Data Management (MDM) and Product Information Management (PIM) solutions is in many organizations (still) being taken care of by ERP applications.

However, there are some serious shortcomings in this approach.

If we look at party master data (customer roles and supplier roles) a classic system landscape can besides the ERP application also have a CRM application and a separate SRM (Supplier Relationship Management / Supplier Onboarding) application. The master data entities covered by these applications are not the same.

Party master data

On the sell side the CRM application will typically also hold a crowd of prospective customers that are not (yet) onboarded into the ERP application. In many cases the CRM application will also have records describing indirect customers that will never be in the ERP application. Only the existing direct customers are shared between the CRM and ERP application. Besides that, the ERP application may have accounts receivable records that have never been onboarded through the CRM application.

On the buy side a functionality of an SRM application is to track the onboarding process and thereby be the system of record for prospective suppliers. Only existing suppliers will be shared between the ERP application and the SRM application. Besides that, the ERP application will have accounts payable records that have never been onboarded through the SRM application.

A main reason for being for a Master Data Management (MDM) solution is to provide a shared registry of every real-world party entity, no matter in which application it is described, thereby ensuring consistency, uniqueness and other data quality dimensions.
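
The shared registry idea can be illustrated with a small Python sketch. It assumes that the records in each application have already been matched to a common party identifier, which in practice is the hard part that an MDM solution takes care of. The application contents and identifiers are hypothetical:

```python
from collections import defaultdict


def build_registry(applications: dict) -> dict:
    """Map each party identifier to the set of applications describing it."""
    registry = defaultdict(set)
    for application, party_ids in applications.items():
        for party_id in party_ids:
            registry[party_id].add(application)
    return dict(registry)


if __name__ == "__main__":
    # Hypothetical contents, following the sell side and buy side described above.
    applications = {
        "CRM": {"prospect-1", "indirect-customer-1", "customer-1"},
        "ERP": {"customer-1", "customer-2", "supplier-1", "supplier-2"},
        "SRM": {"supplier-1", "prospective-supplier-1"},
    }
    registry = build_registry(applications)
    for party_id in sorted(registry):
        print(party_id, sorted(registry[party_id]))
```

Parties listed under only one application, such as a prospect in CRM or an accounts receivable record that exists only in ERP, become visible next to those that are shared.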

When looking at product data, ERP applications must often be supplemented by other applications in order to handle detailed and specific topics.

Product data

Product Lifecycle Management (PLM) applications are becoming popular where enterprise units such as R&D, product management and others have to be supported in handling the series of detailed events that take place from when a new product is first thought of until it is retired, and even after that, in the period where complaints and other events may occur. ERP applications can only properly handle the main status events, such as when the product is ready for sale for the first time, when sale is blocked and when the last piece is taken out of the inventory.

Product Information Management (PIM) applications are becoming popular where enterprise units such as sales and marketing need to provide specific product data that varies between different product groups. Not least, the rise of ecommerce has driven a demand for providing very detailed and specific product information to support self-service selling. ERP applications are not built to cater for this complexity and the surrounding functionality.

The information demand in this scenario also encompasses handling a variety of digital assets, ranging from product images from many angles to line drawings, videos and more. Depending on the range of requirements, this may be handled in a PIM application or separately in a DAM (Digital Asset Management) application.

Where there is no PIM and/or PLM solution in place, the fallback solution to cover the requirements not fulfilled by ERP is a bunch of spreadsheets.

The reason for being for multidomain MDM solutions is to cover the full spectrum of party entities and product entities together with other master data domains such as locations and assets.

Check out the range of solutions to cover this space on this list.