Master Data, Product Information, Reference Data and Other Data

A trend in the data management market is that solutions either go very niche (best-of-breed) in the data domain they cover or expand to encompass a broader range of data types.

This can be seen in the spectrum of master data and product information as reported in the post MDM, PIM or Both.

We also see that governance and management of reference data is being included alongside the management of master data, as described in the post What is Reference Data Management (RDM)?

Some MDM (and RDM) solutions also extend the reach to cover aspects of transaction data and big data. The main scenarios covered are:

  • Matching of party entities in traditional systems of record with the parties referenced in social streams and weblogs (systems of engagement) as well as in sensor data. This can be used in creating a Customer Data Platform (CDP).
  • Extending data quality and data performance dashboards related to master data to cover aggregated transaction data and big data held in data warehouses and data lakes by using a shared set of reference data.
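The first scenario, matching parties in systems of record with parties seen in systems of engagement, can be illustrated with a minimal sketch. It assumes a simple rule (an exact e-mail match wins, otherwise the names must be sufficiently similar) and uses hypothetical records and a hypothetical threshold. Real MDM matching engines typically use far richer rules.

```python
from __future__ import annotations

from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def match_party(record: dict, candidates: list[dict], threshold: float = 0.85) -> dict | None:
    """Return the best-matching master record for a party seen in an engagement stream.

    An exact e-mail match wins outright; otherwise fall back to name similarity.
    The threshold of 0.85 is an assumption chosen for illustration.
    """
    email = normalize(record.get("email", ""))
    name = normalize(record.get("name", ""))
    best, best_score = None, 0.0
    for cand in candidates:
        if email and email == normalize(cand.get("email", "")):
            return cand
        score = SequenceMatcher(None, name, normalize(cand.get("name", ""))).ratio()
        if score > best_score:
            best, best_score = cand, score
    return best if best_score >= threshold else None


# Hypothetical master records (system of record) and a social-stream mention.
crm = [
    {"id": "C-1", "name": "Anna Jensen", "email": "anna.jensen@example.com"},
    {"id": "C-2", "name": "Robert Smith", "email": "rob@example.com"},
]
mention = {"name": "anna  JENSEN", "email": ""}
matched = match_party(mention, crm)
```

Returning no match when the similarity is below the threshold is a deliberate choice in this sketch: a wrongly merged party is usually harder to repair than a missed link.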

When product information is to be shared in business ecosystems through Product Data Syndication (PDS), the sharing can be accelerated by using a data lake concept and new data stores as staging areas. The reason is that a main challenge here is that the data quality standards on the providing side are most often different from those on the receiving side.

MDM PIM RDM and other data

The diagram is a variation of a diagram included in the whitepaper Intelligent Data Hub – Taking MDM to the Next Level. The original was developed together with Salah Kamel, CEO at Semarchy.

What is Product Data Syndication (PDS)?

Product Information Management (PIM) has a subdiscipline called Product Data Syndication (PDS).

While PIM basically is about how to collect, enrich, store and publish product information within a given organization, PDS is about how to share product information between manufacturers, merchants and marketplaces.

Product Data Syndication World

Marketplaces

Marketplaces are the new kids on the block in this world. Amazon and Alibaba are the best known, but there are plenty of others: international ones, ones dedicated to given product groups and national ones. Merchants can provide product information related to the goods they are selling on a marketplace. A disruptive force in the supply (or value) chain world is that manufacturers today can sell their goods directly on marketplaces and thereby leave out the merchants. Still, only a fraction of trade has been diverted this way so far.

Each marketplace has its own requirements for how product information should be uploaded, encompassing which data elements are needed, the requested taxonomy and data standards, and the data syndication method.

Data Pools

One way of syndicating (or synchronizing) data from manufacturers to merchants is going through a data pool. The best known one is the Global Data Synchronization Network (GDSN) operated by GS1 through data pool vendors, where 1WorldSync is the dominant one. Here, trading partners follow the same classification, taxonomy and structure for a group of products (typically food and beverage) and their most common attributes in use in a given geography.

There are plenty of other data pools available that focus on given product groups, either internationally or nationally. The concept here is also that everyone will use the same taxonomy and have the same structure and range of data elements available.

Data Standards

Product classifications can be used to apply the same data standards. GS1 has a product classification called GPC. Some marketplaces use the UNSPSC classification provided by the United Nations and – perhaps ironically – also operated by GS1. Other classifications, which also encompass attribute requirements, are eClass and ETIM.

A manufacturer can have product information in an in-house ERP, MDM and/or PIM application. In the same way a merchant (retailer or B2B dealer) can have product information in an in-house ERP, MDM and/or PIM application. Most often a pair of manufacturer and merchant will not use the same data standard, taxonomy, format and structure for product information.

1-1 Product Data Syndication

Data pools have not substantially penetrated the product data flows encompassing all product groups and all the needed attributes and digital assets. Besides that, merchants also have a desire to provide unique product information and thereby stand out in the competition with other merchants selling the same products.

Thus, the highway in product data syndication is still 1-1 exchange. This highway has these lanes:

  • Exchanging spreadsheets, typically orchestrated so that the merchant requests the manufacturer to fill in a spreadsheet with the data elements defined by the merchant.
  • A supplier portal, where the merchant offers an interface to their PIM environment where each manufacturer can upload product information according to the merchant’s definitions.
  • A customer portal, where the manufacturer offers an interface where each merchant can download product information according to the manufacturer’s definitions.
  • A specialized product data syndication service where the manufacturer can push product information according to their definitions and the merchant can pull linked and transformed product information according to their definitions.
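The last lane, where the manufacturer pushes according to their definitions and the merchant pulls according to theirs, can be sketched as a field mapping with conversions. The field names, the mapping table and the sample record below are all hypothetical. The sketch only illustrates the idea of a transformation between two sets of definitions, not how any particular syndication service works.

```python
# Hypothetical mapping: manufacturer field -> (merchant field, conversion function).
MAPPING = {
    "item_no": ("sku", str),
    "desc_en": ("title", str.strip),
    "weight_g": ("weight_kg", lambda grams: grams / 1000),
}


def transform(pushed: dict, mapping: dict) -> tuple[dict, list[str]]:
    """Translate a pushed manufacturer record into the merchant's structure.

    Returns the transformed record plus the list of fields that have no
    mapping yet, so they can be reviewed instead of being silently dropped.
    """
    pulled, unmapped = {}, []
    for field, value in pushed.items():
        if field in mapping:
            target, convert = mapping[field]
            pulled[target] = convert(value)
        else:
            unmapped.append(field)
    return pulled, unmapped


# A record as the manufacturer might push it (hypothetical data).
pushed = {"item_no": 4711, "desc_en": " Cordless drill ", "weight_g": 1500, "colour": "blue"}
pulled, unmapped = transform(pushed, MAPPING)
```

Reporting unmapped fields separately reflects the point made above: the two trading partners rarely share the same data standard, so gaps in the mapping are to be expected and should be visible.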

In practice, the chain from manufacturer to end merchant may have several intermediate nodes, namely distributors/wholesalers that relay the data by getting product information from an upstream trading partner and passing it on to a downstream trading partner.

Data Quality Implications

Data quality is, as always, a concern when information producers and information consumers must collaborate, and in a product data syndication context the extended challenge is that the upstream producer and the downstream consumer do not belong to the same organization. This ecosystem-wide data quality and Master Data Management (MDM) issue was examined in the post Multienterprise MDM.

Extended MDM Platforms

There is a tendency in the Master Data Management (MDM) market for solution providers to aim at delivering an extended MDM platform to underpin customer experience efforts. Such a platform will not only handle traditional master data, but also reference data and big data (as data lakes), either directly or by linking to the data held there, as well as linking to transactions.

The recent acquisition of AllSight by Informatica is an example of this.

In this context traditional MDM will, supplemented with Reference Data Management (RDM), enable the handling of:

  • Customer, supplier and product identity
  • Customer, supplier and product hierarchies
  • Customer, supplier and product locations

Additionally, the data lake concept can be used for:

Extended MDM Platforms

What is your view: Should MDM solution providers stick to traditional master data or should they strive to encompass other kinds of data too?

Master Data Management Definitions: The A-Z of MDM. Part 3

This guest blog post is written by Justine Aa. Rodian of Stibo Systems. The post is part 3 in a series of 3. Please find part 1 here and part 2 here.


Party data. In relation to Master Data Management, party data is understood in two different ways. First of all, party data can mean data defined by its source. You will typically hear about first, second and third-party data. First-party data being your own data, second-party data being someone else’s first-party data handed over to you, while third-party data is collected by someone with no relation to you—and probably sold to you. However, when talking about party data management, party data refers to master data typically about individuals and organizations with relation to, for example, customer master data. A party can in this context be understood as an attorney or husband of a customer that plays a role in a customer transaction, and party data is then data referring to these parties. Party data management can be part of an MDM setup, and these relations can be organized using hierarchy management.

Learn more about party data here.

PII. Personally Identifiable Information. In Europe often just referred to as personal information. PII is sensitive information that identifies a person, directly (on its own) or indirectly (in combination). Examples of direct PII include name, address, phone number, email address and passport number, while examples of indirect PII include a combination (e.g., workplace and job title or maiden name in combination with date and place of birth).

Product Information Management (PIM). Today sometimes also referred to as Product MDM, Product Data Management (PDM) or Master Data Management for products. No matter the naming, PIM refers to a set of processes used to centrally manage and evaluate, identify, store, share and distribute product data or information about products. PIM is enabled with the implementation of PIM or Product Master Data Management software.

Learn more here.

Product Lifecycle Management (PLM). The process of managing the entire lifecycle of a product from ideation, through design, product development and sourcing, to selling. The backbone of PLM is a business system that can efficiently handle the product information full-circle, and significantly reduce time to market through streamlined processes and collaboration. That can be a standalone PLM tool or part of a comprehensive MDM platform.

Learn more here.

Pool. A data pool is a centralized repository of data where trading partners (e.g., retailers, distributors or suppliers) can obtain, maintain and exchange information about products in a standard format. Suppliers can, for instance, upload data to a data pool that cooperating retailers can then receive through their data pool.

Platform. A comprehensive technology used as a base upon which other applications, processes or technologies are developed. An example of a software platform is an MDM platform.

Profiling. Data profiling is a technique used to examine data from an existing information source, such as a database, to determine its accuracy and completeness and share those findings through statistics or informative summaries. Conducting a thorough data profiling assessment at the beginning of a Master Data Management implementation is recognized as a vital first step toward gaining control over organizational data, as it helps identify and address potential data issues, enabling architects to design a better solution and reduce project risk.
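As a rough illustration of what such statistics can look like, here is a minimal sketch that computes completeness and the number of distinct values per column. The sample rows and the choice of these two measures are assumptions made for the example. A real profiling assessment would cover many more checks, including accuracy against a trusted source.

```python
def profile(rows: list[dict]) -> dict:
    """Summarize each column: share of filled values and number of distinct values."""
    columns = sorted({key for row in rows for key in row})
    summary = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and the empty string as missing.
        filled = [v for v in values if v not in (None, "")]
        summary[col] = {
            "completeness": len(filled) / len(rows) if rows else 0.0,
            "distinct": len(set(filled)),
        }
    return summary


# Hypothetical customer rows with typical issues: gaps and inconsistent codes.
rows = [
    {"id": 1, "country": "DK", "email": "a@example.com"},
    {"id": 2, "country": "dk", "email": ""},
    {"id": 3, "country": "DK", "email": None},
    {"id": 4, "country": None, "email": "d@example.com"},
]
report = profile(rows)
```

In this sample the country column shows two distinct values for what is presumably one country, the kind of inconsistency a profiling summary is meant to surface early.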

Q

Quality. As in data quality, also sometimes just shortened into DQ. An undeniable part of any MDM vendor’s vocabulary as a high level of data quality is what a Master Data Management solution is constantly seeking to achieve and maintain. Data quality can be defined as a given data set’s ability to serve its intended purpose. In other words, if you have data quality, your data is capable of delivering the insight you require. Data quality is characterized by, for example, data accuracy, validity, reliability, completeness, granularity, consistency and availability.

R

Reference data. Data that define values relevant to cross-functional organizational transactions. Reference data management aims to effectively define data fields, such as units of measurement, fixed conversion rates and calendar structures, to “translate” these values into a common language in order to categorize data in a consistent way and secure data quality. Reference Data Management (RDM) systems can be the solution for some organizations, while others manage reference data as part of a comprehensive Master Data Management setup.
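The “translation” into a common language can be sketched with a small reference table for units of measurement. The table contents and the function name are hypothetical, and the sketch assumes that weights should be expressed in kilograms. The pound factor is the internationally defined value.

```python
# Hypothetical reference table: local unit code -> (common unit, fixed conversion factor).
UNIT_REFERENCE = {
    "KG": ("kg", 1.0),
    "G": ("kg", 0.001),
    "LB": ("kg", 0.45359237),
}


def to_common_unit(value: float, unit_code: str) -> tuple[float, str]:
    """Convert a value to the common unit defined in the reference data.

    Unknown codes raise an error instead of passing through, so gaps in the
    reference data are noticed rather than hidden.
    """
    try:
        common, factor = UNIT_REFERENCE[unit_code.strip().upper()]
    except KeyError:
        raise ValueError(f"unit code {unit_code!r} is not in the reference data") from None
    return value * factor, common


weight, unit = to_common_unit(2500, "g")
```

Keeping the codes and factors in one shared table, rather than in each application, is what lets different systems categorize the same values consistently.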

S

SaaS. Software as a Service. A software licensing and delivery model in which software is licensed on a subscription basis and is centrally hosted. SaaS is on the rise, due to changes in consumer behavior and a higher demand for flat-rate pricing, since these solutions are typically paid for on a monthly or quarterly basis. SaaS is typically used in cloud MDM, for instance.

Supply Chain Management (SCM). The management of material and information flow in an organization—everything from product development, sourcing, production and logistics, as well as the information systems—to provide the highest degree of customer satisfaction, on time and at the lowest possible cost. A PLM solution or PLM MDM solution can be a critical factor for driving effective supply chain management.

Silos. When navigating the MDM landscape you will often come across the term data silos. The term describes a situation where crucial data or information, such as master data, is held separately, whether by individuals, departments, regions or systems. MDM’s finest purpose is to “break down data silos.”

Stock Keeping Unit (SKU). A SKU represents an individual item, product or service manifested in a code, uniquely identifying that item, product or service. SKU codes are used in business to track inventory. It’s often a machine-readable bar code, providing an additional layer of uniqueness and identification.

Stack. The collection of software or technology that forms an organization’s operational infrastructure. The term stack is used in reference to software (software stack), technology (technology stack) or simply solution (solution stack) and refers to the underlying systems that make your business run smoothly. For instance, an MDM solution can—in combination with other solutions—be a crucial part of your software stack.

Stewardship. Data stewardship is the management and oversight of an organization’s data assets to help provide business users with high-quality data that is easily accessible in a consistent manner. Data stewards will often be the ones in an organization responsible for the day-to-day data governance.

Strategy. As with all major business initiatives, MDM needs a thorough, coherent, well-communicated business strategy in order to be as successful as possible.

Supplier data. Data about suppliers. One of the domains on which MDM can be beneficial. May be included in an MDM setup in combination with other domains, such as product data.

Learn more about supplier data here.

Synchronization. The operation or activity of two or more things at the same time or rate. Applied to data management, data synchronization is the process of establishing data consistency from one endpoint to another and continuously harmonizing the data over time. MDM can be the key enabler for global or local data synchronization.

Syndication. Data syndication is basically the onboarding of data provided from external sources, such as suppliers. An MDM solution will typically automate the process of receiving external data while making sure that high-quality criteria are met.

Swamp. A data swamp is a deteriorated data lake that is inaccessible to its intended users and provides little value.

T

Training. No, not the type that goes on in a gym. Employee training, that is. MDM is not just about software. It’s about the people using the software, hence they need to know how to use it best in order to maximize the Return on Investment (ROI). MDM users will have to receive training from either the MDM vendor, consultants or from your employees who already have experience with the solution.

U

User Interface (UI). The part of the machine that handles the human–machine interaction. In an MDM solution—and in all other software solutions—users have an “entrance,” an interface from where they are interacting with and operating the solution. As is the case for all UIs, the UI in an MDM solution needs to be user-friendly and intuitive.

V

Vendor. There are many Master Data Management vendors on the market. How do you choose the right one? It all depends on your business needs, as each vendor is often specialized in some areas of MDM more than others. However, there are some things you generally should be aware of, such as scalability (Is the system expandable in order to grow with your business?), proven success (Does the vendor have solid references confirming the business value?) and integration (Does the solution integrate with the systems you need it to?).

W

Warehouse. A data warehouse—or EDW (Enterprise Data Warehouse)—is a central repository for corporate information and data derived from operational systems and external data sources, used to generate analytics and insight. In contrast to the data lake, a data warehouse stores vast amounts of typically structured data that is predefined before entering the data warehouse. The data warehouse is not a replacement for Master Data Management, as MDM can support the EDW by feeding reliable, high-quality data into the system. Once the data leaves the warehouse, it is often used to fuel Business Intelligence.

Workflow automation. An essential functionality in an MDM solution is the ability to set up workflows, a series of automated actions for steps in a business process. Preconfigured workflows in an MDM solution generate tasks, which are presented to the relevant business users. For instance, a workflow automation is able to notify the data steward of data errors and guide them through fixing the problem.
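A minimal sketch of the idea, assuming a workflow that runs validation rules over product records and creates a task with a fixing hint for the data steward. The rules, field names and steward address are hypothetical and do not reflect any specific MDM product.

```python
from dataclasses import dataclass


@dataclass
class Task:
    record_id: str
    assignee: str
    issue: str
    hint: str


# Hypothetical validation rules: (check function, issue text, hint for the steward).
RULES = [
    (lambda r: bool(r.get("gtin")), "missing GTIN", "Request the GTIN from the supplier."),
    (lambda r: (r.get("weight_kg") or 0) > 0, "weight must be positive",
     "Check the unit and re-enter the weight."),
]


def run_workflow(records: list, steward: str) -> list:
    """Apply every rule to every record and create one task per failed check."""
    tasks = []
    for record in records:
        for check, issue, hint in RULES:
            if not check(record):
                tasks.append(Task(record["id"], steward, issue, hint))
    return tasks


records = [
    {"id": "P1", "gtin": "05712345678905", "weight_kg": 1.2},
    {"id": "P2", "gtin": "", "weight_kg": 0},
]
tasks = run_workflow(records, steward="steward@example.com")
```

Each task carries both the issue and a hint, mirroring the notion of notifying the steward and guiding the fix.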

Y

Yottabyte. One of the largest named data storage units (1,000,000,000,000,000,000,000,000 bytes, or 10^24 bytes); the SI prefixes ronna (10^27) and quetta (10^30), added in 2022, are larger still. No Master Data Management solution, or any other data storage solution, can handle this amount yet. Still, scalability should be an important factor in choosing an MDM solution.

Z

ZZZZZ… With a Master Data Management solution placed at the heart of your organization you get to sleep well at night, knowing your data processes are supported and your information can be trusted.

If you’d like the whole A-Z e-book in a downloadable format, please find it here.

Justine Aagaard Rodian is a marketing specialist at Stibo Systems with a background as a journalist. Five years in the data management industry has armed Justine with unique insights and she is now using her storytelling and digital skills to spread valuable business knowledge about Master Data Management and related topics.