Four things that must happen for successful health data sharing

Andy Singleton
Maxos Digital Securities
Jul 13, 2017


The last 20 years of work and billions of dollars spent on health data “interoperability” have failed to give us the benefits of real sharing: choice of health care providers, better continuity of care, lower costs, fewer errors, and faster discovery of treatments that work. The architecture is fatally flawed. Logically, it cannot work to connect every health care provider to every other health care provider and expect them to understand each other’s data. We can stop wasting vast amounts of time and money by focusing on a few points of architecture that logically must be present for successful data sharing.

1) Demand

The original sin of the meaningful use initiative was to try to push a supply of EMR (electronic medical record) data into a market without a demand pull. The inevitable result is that the data is bad and the initiative is underfunded. If nobody is using the data and demanding quality, why make it good? If nobody is paying for the data, why invest in supplying it?

Architecturally, we must start with demand. Where there is demand, investment will inevitably follow.

Doctors do not contribute to the demand for data. Doctors have their own way of gathering information from the patients who are in front of them. They are suspicious of information from other sources, they don’t process big data the way computers do, and they don’t want to be held responsible for monitoring real-time sensor data. Patients don’t demand data either. They have rejected every attempt to give them personal health records. They don’t care about data, and they are strangely uninterested in their own health. They do, however, periodically want good health care advice.

Fortunately, there are new ways to convert data into good advice. An industry has emerged that converts bulk data into knowledge and advice with the help of computer analytics. The AI beast is hungry. The people who run health care providers would love to use data-driven apps, rather than scarce and variable doctors, to provide real-time advice and consistent experiences to their customers. A host of new mobile and telehealth app vendors are supplying tools to those providers and selling data-driven advice directly to consumers. Device vendors are supplying apps, too. Thus, we now see an economy emerging that acquires data, analyzes it with computers, and provides doctor-not-present advice with continuously improving consistency. These services are (fortunately) much less expensive than traditional medical labor, but they will generate more than enough money to pay for a great leap forward in our data economy.

So, new demand is lying in wait. We should quantify it by asking these parties how much they will pay. That will unlock investment. I’m hoping to create more demand with events like the Connected Health Vision Workshops, which will give people an opportunity to pitch all of the great ways they will improve health and health care when they have access to rich data about their customers. This will motivate a complete, demand-driven data economy.

2) There must be a way to get and interpret the data that is actually available

It’s a fantasy of those working on interoperability that every provider of data will offer access to that data with a standard FHIR API, opening the door to the multitude of health care vendors, apps, AI advisers, and researchers that could benefit from this data. Realizing this fantasy will take more than a decade. First, we need to agree on the standards. Then, health care providers have to deploy systems that implement those standards. These providers may be good at providing health care, but they are very slow at implementing information technology and not reliable at implementing standards. After that is done, we will still be left with a semantic problem: each field of data in our “standard” is likely to have a different meaning from provider to provider. The same “blood pressure” field, for example, may hold a single office reading at one provider and an average of home readings at another.

So, if we want to see progress in our lifetimes, it won’t help us to wait for perfectly standard data and APIs. There is an alternative. We can make progress now, by using the data and APIs that are actually available.

I have proposed to fix this problem with a project that I dub the Open Data Acquisition Cloud. It will convert every participating data source into a clean API with standard semantics. The API will be defined by the customers, the people who are demanding the data. The implementation is simple: a set of open source scripts, one for each source of data, that can read the data, understand the idiosyncrasies of the source, and interpret it into a standard format. We can get this working quickly, and keep it working, with continuous integration, which simply means that we run the scripts every day and make sure they still work. Stay tuned as we ramp up this project.
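To make the idea concrete, here is a minimal sketch of what one such adapter script might look like. Everything in it is hypothetical: the source URL, the source’s field names, and the target schema are stand-ins for whatever a real source and real customers would define.

```python
# Minimal sketch of one Open Data Acquisition Cloud adapter script.
# The source URL, source field names, and target schema are hypothetical.
import requests  # assumes the source exposes a simple HTTP endpoint

# The customer-defined target schema: one flat record per observation.
TARGET_FIELDS = {"patient_id", "observed_at", "measure", "value", "unit"}

def fetch_raw(source_url: str) -> list[dict]:
    """Read whatever the source actually offers, in its native format."""
    response = requests.get(source_url, timeout=30)
    response.raise_for_status()
    return response.json()

def interpret(raw: dict) -> dict:
    """Map this source's idiosyncratic fields onto the standard semantics."""
    return {
        "patient_id": raw["pt_ref"],         # this source's name for the ID
        "observed_at": raw["obs_dt"],
        "measure": "body_weight",
        "value": float(raw["wt"]) * 0.4536,  # this source reports pounds
        "unit": "kg",
    }

def run(source_url: str) -> list[dict]:
    """One adapter run; a daily CI job calls this and fails loudly."""
    clean = [interpret(record) for record in fetch_raw(source_url)]
    assert all(set(rec) == TARGET_FIELDS for rec in clean)
    return clean
```

Each source gets its own `interpret` function, because that is exactly where the idiosyncrasies live; the rest of the script is boilerplate that the daily CI run keeps honest.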

This will happen quickly, and after we get it implemented, we can proceed to the next step.

3) There will be an index that shows the locations of the data about a person, and indicates who is permitted to get it

Inevitably, data about me is accumulating in multiple, scattered locations. As much as my local monopolistic health care provider would like to track me in one place, that doesn’t happen, even for basic medical records. I move around to different providers with different systems. And their big EMR starts to look small next to newer and bigger sources of data: my fitness tracker, scale, and ambient devices; my apps, with their ability to sense, watch, and interview; my genomic tests; and more. The old EMR data does not satisfy the new demand.

I would like my advisers to have a coherent view of my health data. So, we will need an index that tracks where this data is. When someone creates a new bundle of data about me, they should register it. Then, I can come along and add permissions that authorize others to go get it.

This index of locations and permissions is a good use case for blockchain technology, or for our upcoming Trust but Verify database (blockchain applications at cloud scale). The index should be a public resource that anyone can add to, and that I and my health advisers can always access. It’s small enough to fit in a shared database.
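Here is a minimal sketch of what such an index might hold. The record layout and methods are hypothetical; a real version would add authentication and live on a blockchain or a shared database.

```python
# Minimal sketch of the index of data locations and permissions.
# The record layout and API are hypothetical illustrations.
from dataclasses import dataclass, field

@dataclass
class DataBundle:
    """A pointer registered by whoever creates a bundle of data about me."""
    person_id: str
    source: str     # e.g. "fitness-tracker", "clinic-emr", "genome-lab"
    location: str   # where an authorized party can go get the data

@dataclass
class HealthDataIndex:
    bundles: list[DataBundle] = field(default_factory=list)
    grants: set[tuple[str, str]] = field(default_factory=set)  # (person, grantee)

    def register(self, bundle: DataBundle) -> None:
        """Anyone who creates data about a person adds a pointer here."""
        self.bundles.append(bundle)

    def permit(self, person_id: str, grantee: str) -> None:
        """The person authorizes a grantee to retrieve their data."""
        self.grants.add((person_id, grantee))

    def locate(self, person_id: str, grantee: str) -> list[str]:
        """A permitted adviser asks: where does this person's data live?"""
        if (person_id, grantee) not in self.grants:
            raise PermissionError(f"{grantee} not authorized by {person_id}")
        return [b.location for b in self.bundles if b.person_id == person_id]
```

The point of the sketch is the separation of concerns: data creators only register pointers, the person only manages grants, and advisers only look up locations. The data itself never passes through the index, which is why it stays small enough to fit in a shared database.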

This step will take longer. It will be the underlying technology for the US government-supported “consumer-directed exchange.”

4) Big data about a person must eventually come to one place for analysis

AI advisers thrive on bigger, richer datasets, and they will be happy: more and more data is available about me, and each piece of it is getting bigger. It goes far beyond doctors’ notes and prescriptions to full images, videos, genomes, microbiomes, proteomes, and ambient sensor streams. For my advisers to work effectively, they will need to gather the data in one place and look at it.

Actually, I will gather the data in one place, a personal data store, and invite AI advisers to come look at it. They will make house calls. They will arrive in code containers, and we will bring the code to the data.
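As a sketch of that pattern, assuming Docker as the container runtime and hypothetical image and path names: the adviser’s code runs next to the data, and only the small result travels back.

```python
# Minimal sketch of an AI adviser "house call": the adviser runs in a
# container next to my data, and only the advice travels over the network.
# The image name and data-store path are hypothetical; assumes Docker.
import json
import subprocess

PERSONAL_DATA_STORE = "/data/me"  # large local dataset; never leaves this host

def make_house_call(adviser_image: str) -> dict:
    """Run a containerized adviser with read-only access to my data store."""
    result = subprocess.run(
        ["docker", "run", "--rm",
         "--network", "none",                      # no way to exfiltrate data
         "-v", f"{PERSONAL_DATA_STORE}:/data:ro",  # data mounted read-only
         adviser_image],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)  # only the small result comes back out
```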

We will need to keep the data in one place because it will be too big to move around on the network. It already takes minutes to move one batch of images or genomic data to a point of analysis. This problem will get worse because the volume of data is growing faster than network capacity. Roughly, data grows at 100% per year, while network bandwidth grows at less than 50% per year. Transfer time scales with size divided by bandwidth, so it compounds at about 2.0/1.5 ≈ 1.33 per year; five years from now, it will take roughly four times longer to move a typical dataset. And we will have more places to send it, as a diverse group of AI experts hone their craft.
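A quick sanity check of that arithmetic, under those growth assumptions:

```python
# Data doubles yearly (+100%) while bandwidth grows 50% per year, so
# transfer time (size / bandwidth) compounds at 2.0 / 1.5 per year.
data_growth, bandwidth_growth, years = 2.0, 1.5, 5
slowdown = (data_growth / bandwidth_growth) ** years
print(f"Relative transfer time after {years} years: {slowdown:.1f}x")  # ~4.2x
```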

When we have true AI code working with true big data, we will need to keep a personal data store, and bring the code to the data.

Roadmap

Effective data sharing is unlikely to happen without these steps. If you are working on a data sharing topology that does not include them, it probably won’t work. Without demand, we won’t be able to fund our projects. We could wait forever for data providers to implement standard data access, so eventually we will realize that we need step 2: an ability to interpret the data that is available. If we want consumers to be able to grant apps access to their data, the apps have to find the data, so we need step 3: an index. Later, we might find that Apple’s or Google’s massive and monopolistic brain is giving us some sort of health advice based on its database, but we will also find that there are other AI advisers that are better and more specialized, so we will need step 4: a place to bring them so they can examine and advise us.

I think these things will happen sequentially. We are already seeing demand. Next, we will figure out how to use the data that is out there, one source at a time. Then, we will track down each person’s data footprint to get a more complete view. Then we can retrieve it and make it available to our amazing AI advisers.

If you are working on health data sharing, don’t waste your time on doomed architecture. Play to win. Take steps toward an architecture that will be successful.


Software entrepreneur/engineer. Building DeFi banking at Maxos (https://maxos.finance). Previously started Assembla, PowerSteering Software, SNL Financial.