Guest Column | August 31, 2015

An Alternative Approach To The Data Warehouse In Healthcare Analytics

By George Dealy, Vice President of Healthcare Applications, Dimensional Insight

It’s clear from past experience that traditional ways of applying business intelligence and analytics technology to healthcare information haven’t worked particularly well. The “lessons learned” narratives on healthcare analytics at various conferences are remarkably consistent:

  • be careful about taking on too much too fast
  • focus on delivering “early and often” to get constructive feedback
  • make sure that the user community will derive value from what you do deliver

Of particular note is the dissatisfaction with the data warehouse model. One of the biggest criticisms is that this model often represents a “build it and they will come” approach. Unfortunately, data warehouse-focused projects are often implemented without regard to the problems that need to be solved, and they end up being “solutions looking for problems.” This leads to a high failure rate, which Gartner says is 80 percent for these types of projects. (Source: Gartner, “Top Actions for Healthcare Delivery Organization CIOs, 2014: Avoid 25 Years of Mistakes in Enterprise Data Warehousing,” Feb. 2014.)

Rather than accepting an approach that’s clearly not working, healthcare organizations need a new way to think about analytics – an approach that at its core considers the goals of business, operational and clinical communities, and aims to help them reach those goals quickly and efficiently. Here, we will examine such an approach and discuss why it may not require a data warehouse.

Old Way Of Thinking: Data As Static
A data warehouse is typically defined as a central repository of an organization’s most important data. It’s the place from which data can be conveniently and flexibly accessed independent of source transactional systems. In practice, this is a very stationary concept: once in the warehouse, most of the data will change very little. However, some of the most interesting and useful information is that which reflects what’s happening right now, and that’s very dynamic. So, most of the warehouse is devoted to data that will likely be used less and less over time.

New Way Of Thinking: Data In Motion
An alternative view would be to think of data as being in constant motion. What’s most important in this model is a continuous and timely flow of information that represents the current state of the organization and its processes, as well as how that state is changing over time. For example, what is the trend in our hospital census from hour to hour, and how is it being affected by an outbreak of the flu? We may be able to use this type of information to make better decisions if we can understand it from multiple perspectives. That’s analytics at its essence. But to be effective, this approach requires the most relevant information to be available when the decision needs to be made.

For many scenarios such as this one, we don’t necessarily need a comprehensive data warehouse. Instead, the emphasis would shift to extracting data from its original sources, then combining and transforming that data into the information that’s most useful to the user community. That information could also be stored at an appropriate level of summarization to provide historical perspective. The result is similar to that provided by the data warehouse, but with an emphasis on the aspects of the problem that represent the highest value to the user community.
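
To make the census example concrete, here is a minimal sketch in Python of turning raw admission and discharge timestamps into an hourly census and its hour-to-hour change. The record layout, the sample timestamps, and the function name hourly_census are assumptions made for illustration only; they do not come from any particular EHR or product.

```python
from datetime import datetime, timedelta

# Hypothetical (admitted, discharged) pairs pulled from a source system.
# A discharge time of None means the patient is still in house.
stays = [
    (datetime(2015, 8, 31, 6, 15), datetime(2015, 8, 31, 9, 40)),
    (datetime(2015, 8, 31, 7, 5), None),
    (datetime(2015, 8, 31, 8, 30), None),
]


def hourly_census(stays, start, hours):
    """Count patients in house at the top of each hour, beginning at `start`."""
    census = {}
    for h in range(hours):
        t = start + timedelta(hours=h)
        census[t] = sum(
            1
            for admitted, discharged in stays
            if admitted <= t and (discharged is None or discharged > t)
        )
    return census


census = hourly_census(stays, datetime(2015, 8, 31, 7, 0), 4)
counts = list(census.values())

# Hour-to-hour change: the "trend" a user would actually look at.
deltas = [later - earlier for earlier, later in zip(counts, counts[1:])]

for t, n in census.items():
    print(t.strftime("%H:%M"), n)
```

Only the summarized counts would need to be kept for historical perspective in a sketch like this; the detailed stay records remain in the source system.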

It’s important here to distinguish between data warehouses and operational data stores. Most EHRs and other transactional systems maintain their own operational data stores for offline reporting and archiving. Data warehouses in many respects duplicate the role of the operational data store. If you’re going to invest a lot of time, money and effort, doesn’t it make sense to focus on things that will add value rather than re-creating what already exists?

Other systems embrace this notion of data in motion. For example, consider the way that Google handles Internet data. It doesn’t store information from different websites all over again. That would take far too many resources, and it’s just not necessary. Rather, it indexes data so you can explore it in contextually relevant ways.

We can think about analyzing healthcare data in a similar manner. The real goal is to derive knowledge from the data. What’s arguably more important than the storage of data is how it can be transformed into new information that will guide you toward better decisions and actions. But the age-old challenge lies in making this quickly evolving knowledge readily and consistently accessible to the people who can best take advantage of it.

Where Do You Begin?
Adopting this alternative approach allows us to change the way we implement analytics. No longer do we need to start by “boiling the ocean” (tackling everything at once) as often happens with data warehousing projects. Rather, we can focus on a limited number of opportunities to produce new information and knowledge that will lead to high value, measurable outcomes.

This is a noble goal, but where should you focus your efforts to make sure you get off to a good start? There are three areas that are important to address before any analytics project starts rolling down the tracks:

  • user community perspectives, opportunities, and imperatives
  • data governance
  • business rules

If you consider these areas, you’ll be able to stay ahead of the game and out of trouble – and ultimately increase your probability of success both near term and over the long run.

User Perspectives, Opportunities, And Imperatives
The first step to success is to understand your users’ domain, empathize with their challenges and priorities, and comprehend information from their perspective.

  • Understand the most critical problems users are trying to solve and determine if and how analytics could help.
  • Identify what data would be necessary for a useful analytics solution. Determine what transformations may be necessary for the data to be meaningful. Data from transactional systems often doesn’t include useful measurements. It will be up to you to figure out what measurements are most valuable and how to define them.
  • Decide what visual contexts will do the most effective job of telling the story of the data to the user community.

Details are ultimately important, but you need a reason to get to them. Visualization is about identifying and understanding patterns in order to see the forest for the trees, which is most effectively done through a combination of measurement and summarization.

The most important information is typically highly summarized and presented in a visually intuitive way. The very granular, detailed data may ultimately be important, but getting there is typically a result of finding a pattern worth investigating.

Data Governance
The second step to success is data governance. This refers to processes that dictate how data is managed in your organization. This is critical to address prior to kicking off an analytics project so that everyone in your organization can trust the validity of your data. The problem in many organizations is that business users typically know what they want to do with data, but they often don’t understand the intricacies involved in conditioning it so it reflects reality. Conversely, IT has a good handle on the data, but doesn’t always understand how it gets used. That’s why ironing out data governance and understanding one another’s point of view is critical prior to any analytics project.

Getting a handle on data governance requires:

  • bringing together the user community, including leadership
  • reaching consensus, or agreeing to disagree, on certain issues
  • formalizing the process

Business Rules
The third step is to address business rules, which are intertwined with data governance. Business rules refer to the transformations that are applied to your data between the original data source and the user presentation. An example in healthcare would be how length of stay is measured. This measure is not “one size fits all,” and there is a legitimate need for multiple definitions that help answer different types of questions. Are you focused on certain populations: acute vs. non-acute, adult vs. pediatric, elective vs. emergent? Do you need to stratify by disease type? Are you considering patients who haven’t been discharged from the hospital yet? Before you can begin meaningfully analyzing your data, you’ll need to both define and implement these rules. Then, most importantly, you need to “close the loop” through the data governance process to ensure that everyone understands what measure they are working with, specifically how it’s defined, and what it’s intended to measure.
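
As an illustration of why the definition matters, the following Python sketch computes average length of stay under two hypothetical rules: one that counts only completed stays, and one that also counts patients still in house, measured up to a reporting time. The Encounter layout, the rule names, and the sample data are all assumptions made for this example; your organization’s definitions may well differ.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

SECONDS_PER_DAY = 86400


@dataclass
class Encounter:
    # Hypothetical record layout; real source systems will differ.
    patient_id: str
    admitted: datetime
    discharged: Optional[datetime]  # None means not yet discharged


def los_discharged_only(enc: Encounter) -> Optional[float]:
    """Rule A (assumed): length of stay exists only for completed stays."""
    if enc.discharged is None:
        return None
    return (enc.discharged - enc.admitted).total_seconds() / SECONDS_PER_DAY


def los_including_in_house(enc: Encounter, as_of: datetime) -> float:
    """Rule B (assumed): undischarged patients are measured up to `as_of`."""
    end = enc.discharged if enc.discharged is not None else as_of
    return (end - enc.admitted).total_seconds() / SECONDS_PER_DAY


def average(values):
    """Mean of the values that are defined; None if there are none."""
    vals = [v for v in values if v is not None]
    return sum(vals) / len(vals) if vals else None


encounters = [
    Encounter("p1", datetime(2015, 8, 1, 8), datetime(2015, 8, 3, 8)),
    Encounter("p2", datetime(2015, 8, 2, 12), datetime(2015, 8, 6, 12)),
    Encounter("p3", datetime(2015, 8, 1, 0), None),  # still admitted
]
as_of = datetime(2015, 8, 10, 0)

avg_a = average(los_discharged_only(e) for e in encounters)
avg_b = average(los_including_in_house(e, as_of) for e in encounters)

print("Rule A average LOS (days):", avg_a)
print("Rule B average LOS (days):", avg_b)
```

The same three encounters produce different averages under the two rules, which is the kind of discrepancy the governance process has to surface and label before the measure reaches users.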

Done well, business rules are the foundation of that “single version of the truth” that is so essential to effective analytics. They help determine what data is actually needed to produce relevant and reliable quantitative measurements. But getting to consensus on definitions is difficult work. It requires collaboration, common understanding, and compromise.

Coming to agreement on measures and rules may be the most important, and challenging, part of the whole analytics puzzle. Be forewarned: if you shortcut this step, you’ll pay the price down the line as you’ll leave it up to individual users to “roll their own” business rules.

Conclusion
When it comes to healthcare analytics, you don’t have to accept the status quo of the data warehouse model. There are other ways to approach analytics that don’t require a data warehouse and make implementation much more successful in the long run. By focusing primarily on users, organizational goals, data governance, and business rules, you can find true analytics success.

About The Author
George Dealy is vice president of healthcare applications at Dimensional Insight (www.dimins.com), named 2014 “Best in KLAS” for business intelligence/analytics. You can reach Dimensional Insight on Twitter at @DI_tweet.