The Important Role of Metadata in Business Intelligence

Ever try to put something together without reading the instructions? We’ve all done it, and then we had to pull the thing apart and read the instructions to figure out where we went wrong before putting it back together. Metadata is a piece of the overall instruction set of a BI effort, part of the circuitry of your Data Warehouse environment. Let me explain.

What is Metadata?

Simply put, Metadata is data about data. It rarely gets much attention until you have to describe something, but anyone who accesses, develops, or maintains data needs to know exactly what they are looking at. In his book, The Data Warehouse Toolkit (2nd edition), modern Data Warehousing founder Ralph Kimball defines metadata as “all the information that defines and describes the structures, operations, and contents of the DW/BI system.” Kimball describes three types:

    1. Technical Metadata defines the objects and processes which comprise the DW/BI system.
      Technical, or Physical, Metadata is what is stored in your data source. It’s your physical schema. It’s your tables, columns, and the data stored in those objects. The data dictionary is typically built from this metadata (usually by the IT department). When the data dictionary is compared with the metadata dictionary (described under Business Metadata below), a gap analysis can identify missing or incomplete data.
    2. Business Metadata describes the data warehouse contents in user terms, including what data is available, where it came from, what it means, and how it relates to other data.
      Business, or Logical, Metadata is how the business or end user defines their business, along with the terms and calculations they use to perform their job. A metadata dictionary is a non-technical description of what needs to be reported on, and how data should be described.  The definitions here tend to include calculations more complex than a simple mapping to the physical columns in the source.  Reports, extracts, and access layers can be readily mapped using this metadata.
    3. Process Metadata describes the warehouse’s operational results.
      Process, or Operational, Metadata is the information tied to the metrics captured when a system executes, including traceability, lineage, and auditing info. When did the system run? For how long? Where did the data come from? This Metadata is most useful in debugging, and in determining the health of the overall system and data.
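
As a rough illustration of how the three types fit together, the sketch below compares a small data dictionary (technical metadata) with a metadata dictionary (business metadata) to find gaps, and records a simple process metadata entry for the run. Every table, column, term, and job name here is hypothetical, and the structures are assumptions made for illustration rather than a prescribed format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Technical metadata: physical columns that exist in the source (hypothetical names).
data_dictionary = {
    "sales.order_id",
    "sales.order_date",
    "sales.net_amount",
    "customer.customer_id",
}

# Business metadata: business terms mapped to the physical columns they depend on.
metadata_dictionary = {
    "Net Revenue": ["sales.net_amount"],
    "Order Count": ["sales.order_id"],
    "Customer Region": ["customer.region"],  # not present in the source above
}


@dataclass
class RunRecord:
    """Process metadata: when a job ran, for how long, and against which source."""
    job_name: str
    source: str
    started_at: datetime
    duration_seconds: float


def gap_analysis(physical, business):
    """Return, per business term, the required columns missing from the source."""
    gaps = {}
    for term, columns in business.items():
        missing = [c for c in columns if c not in physical]
        if missing:
            gaps[term] = missing
    return gaps


started = datetime.now(timezone.utc)
gaps = gap_analysis(data_dictionary, metadata_dictionary)
finished = datetime.now(timezone.utc)

run = RunRecord(
    job_name="metadata_gap_analysis",  # hypothetical job name
    source="sales_db",                 # hypothetical source system
    started_at=started,
    duration_seconds=(finished - started).total_seconds(),
)

print(gaps)
print(run)
```

With this sample data, the only gap reported is the "Customer Region" term, which depends on a column the source does not have.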

Why use Metadata?

Whether it lives in the cloud or on-premises, the location of your metadata doesn’t matter. Metadata is a critical aspect of the business intelligence effort when building a data warehouse or data mart. A well-written metadata dictionary becomes the inventory list that catalogs what data is to be stored. BI approaches this data from the opposite direction to transactional applications. In transactional applications, you know what you’re going to populate and write code to align to that data. In contrast, BI starts with what the business user needs to report on, and works back to find that data. So it’s critical that we understand what data users need to do their job, and how they need it. Remember, data that doesn’t exist, or is improperly formatted, is still needed. When the warehouse can’t supply it, spreadsheets get created and become an ever-present stopgap that is difficult, if not impossible, to get rid of.
Here’s a common scenario: ever had difficulty getting two individuals, managers, or organizations to agree on how something is defined? Unless the definition is in writing and agreed upon, ambiguous or conflicting definitions quickly become critical issues for development, data maintenance, data quality, and reporting (in other words, a hit to your budget). Besides being a building block of the BI project’s design phase, a well-written metadata dictionary becomes an input to your Data Governance process, which referees data conflicts and confusion. Without it, the amount of rework, bad data, and loss of user confidence (especially in a large effort) is staggering.
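
To make the idea of a written, agreed-upon definition more concrete, here is a minimal sketch of how proposed definitions might be collected and checked for conflicts before they enter the metadata dictionary. The terms, departments, and definitions are all made up for illustration, and the matching rule (ignoring case and extra whitespace) is just one simple assumption.

```python
from collections import defaultdict

# Proposed definitions gathered from different groups (all values hypothetical).
proposals = [
    {"term": "Active Customer", "owner": "Sales",
     "definition": "placed an order in the last 90 days"},
    {"term": "Active Customer", "owner": "Marketing",
     "definition": "opened an email in the last 30 days"},
    {"term": "Net Revenue", "owner": "Finance",
     "definition": "gross revenue minus returns and discounts"},
    {"term": "Net Revenue", "owner": "Sales",
     "definition": "Gross revenue minus returns and discounts"},
]


def find_conflicts(proposals):
    """Return the terms that have more than one distinct definition."""
    by_term = defaultdict(dict)
    for p in proposals:
        # Normalize case and whitespace so trivial differences are not flagged.
        key = " ".join(p["definition"].lower().split())
        by_term[p["term"]].setdefault(key, []).append(p["owner"])
    return {term: defs for term, defs in by_term.items() if len(defs) > 1}


conflicts = find_conflicts(proposals)
for term, defs in conflicts.items():
    print(f"{term}: {len(defs)} competing definitions")
    for definition, owners in defs.items():
        print(f"  {', '.join(owners)}: {definition}")
```

In this sample, only "Active Customer" is flagged, because the two "Net Revenue" entries differ only in capitalization. Flagged terms are the ones a Data Governance process would need to settle in writing.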
Is your organization lacking a Data Warehouse or BI environment? If you do any type of reporting at all, you need a clear set of written definitions for your terms and the calculations you’re reporting on, which you can then use to better scale your business. If you’d like some help building your Data Warehouse or BI environment, check out our Data Services or reach out. We’d love to get you started.

4 Sentry Parkway East, Suite 300, Blue Bell PA, 19422
Phone: 610 239 8100