Monday, April 16, 2018

IHE Perspective on EU GDPR

I just became aware of a Whitepaper published by IHE Europe in January on "IHE perspective on EU GDPR".

I did not have a hand in writing this whitepaper. It looks good to me. My evaluation is only of the Security and Privacy capabilities IHE offers, not of the GDPR interpretation. All of the IHE profiles available to support security and privacy are outlined on this IHE page. Their whitepaper does not mention the Document Digital Signature (DSG) profile or the Document Encryption (DEN) profile. Both would have only a supporting role in GDPR compliance; I mention them only for completeness.

Other IHE Europe publications

Their Conclusion

The examples discussed [above] highlight the complexity of applying the GDPR to processes in health care and how the requirements are interwoven with IHE Profiles. The good news is that even today IHE Profiles provide solutions by combining security and privacy specific IHE Profiles such as ATNA, IUA, XUA, BPPC and APPC with the Profiles focused on information exchange in cross-border, national or regional ehealth deployments.

In conclusion the GDPR can be an effective catalyst to significantly extend the reach and use of IHE Profiles. Some Profiles or combinations of Profiles already meet GDPR’s security and privacy requirements. Others enable the portability of health information which will become a topic for any vendor providing solutions. 

The users of IHE Profiles can be assured that the IHE community will work on evaluating and enhancing the Profiles to meet the GDPR requirements.

GDPR impact beyond EU

I look forward to GDPR. I think that it will bring a focus to Security and Privacy topics. I hope that enforcement drives adoption, and that reasonable enforcement drives reasonable reaction. I fear that an overly strict interpretation of GDPR could drive away some very important advancements in healthcare and social networking. I welcome the extensive and painful penalties for non-compliance.

Thursday, April 12, 2018

De-Duplicating the received duplicate data

Everyone is frustrated by duplicate data. In the healthcare space there is a fresh cry from Clinicians around their frustration at seeing duplicate data. On the bright side, this means that they are now getting data. So we in the Interoperability space MUST be succeeding with all the efforts to create Health Information Exchanges and to enable Patients to access their data.


We standards geeks are quickly put in our place because we failed to prevent this duplicate data problem... Well, yes and no. Each standard we created included mechanisms that are there specifically to prevent duplication. However, when those standards are used, shortcuts are taken. It might be a shortcut in the software development. It might be a shortcut in deploying a network. It might be a shortcut in deploying a network of networks. It might be a shortcut when the data was created. It might be a shortcut when the data was exported. It might be a shortcut when the data was 'Used'... But it is shortcuts, that is, places where the standard was not used the way it was intended to be used.

Are these shortcuts bad???? Not necessarily. Many times a shortcut is taken to get a solution working quickly. If no shortcuts were taken, then we would not be where we are today. Thus shortcuts are good, in the short term. Shortcuts are only bad when they are not fixed once they are determined to be presenting a problem. Some shortcuts never present a problem.

Standards solutions to Duplicate Data

Let me explain the things that are in the standards we use today (XDS, XCA, CDA, and Direct) that can be used to prevent duplicate data (a small illustrative sketch follows the list):
  • Patient Identity -- the protocols used to create the virtual identity out of the many identities given to a Patient by many different organizations. (XCPD, PIX, PDQ, etc)
  • Home Community ID – unique identifier of a community of organization(s) 
  • Patient ID Assigning Authority (AA) – uniquely identifies the authority issuing patient identifiers. Usually one per healthcare organization, although can be assigned at a higher level.
  • Document unique ID – uniquely identifies a document regardless of how it was received (Including when received through Direct or Patient portals)
  • Document Entry Unique ID -- A document entry is metadata about a document, including the document uniqueID. A document entry has a unique ID.
  • Element ID – unlikely to be used today, but the standards support it. Fundamental to FHIR core
  • Provenance - unlikely to be used today, but would uniquely identify the source
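
To make this layering concrete, here is a minimal sketch (purely illustrative; the field names are mine, not from any IHE schema) of the identifiers a Document Consumer could record for each received document and use as de-duplication keys:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass(frozen=True)
    class DedupKeys:
        """Identifiers, from broadest to narrowest scope, that can reveal duplicates."""
        home_community_id: Optional[str]                # community that uniquely holds the data
        patient_id_assigning_authority: Optional[str]   # AA that issued the patient ID
        document_unique_id: str                         # uniqueId of the document itself
        entry_unique_id: Optional[str]                  # unique ID of the Document Entry metadata
        element_ids: frozenset = frozenset()            # element/resource-level IDs, when present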

Elaboration of these points

This is a complex problem, and many layers are used to solve various parts of it, where each layer addresses a specific portion of the complexity.

Discovering the virtual Patient Identity

Protocols like IHE PIX, PDQ, and XCPD are designed to discover the various identifiers that the patient is known by. This is a reality even where a government dictates a national identifier.

Duplicate network pathways

The broadest reason for duplicate data is that there are multiple pathways to the same repository of data (documents), such as HealtheWay, CommonWell, or CareQuality. Use just one of these and you don't have multiple pathways; use more than one and you might. The reason to use more than one is that each network has only a subset of the overall healthcare providers. The duplication comes from the fact that some providers participate in more than one network... just as you are doing... Thus if no one participated in multiple networks, there would be no duplicate pathways.
Heat-Map for CareQuality Network

You might end up finding that you have two or three pathways to the same healthcare organization. You could just disable specific endpoints through specific networks, choosing to talk to a partner only through one of the networks. I would argue that this method of avoidance is low tech and initially effective. However, as the networks mature and expand, we need a method to recognize when a new duplicate pathway appears.

I would argue that having multiple pathways is possibly useful to address major disasters that take out one of the networks, or one of the pathways.

Duplicate pathways are detectable, and when detected, can be automatically prevented.

Detecting duplicates by homeCommunityID: This is the most reliable approach, but not foolproof. It does require that the participants in these networks use the homeCommunityId as it was intended, as an identifier of a community that uniquely holds data.

Special case of hiding communities: Most configurations of XCA behave, but there are some communities that hide many sub-communities behind them. If these sub-communities are only attached through the one community interface, then there is no problem, and that is the likely case for these configurations. These configurations are done this way as a convenience to the sub-communities; that is to say, the sub-communities like that the larger community adds value and connects them to the world. If one of these sub-communities ever decides to connect to another network, then it must become a full community everywhere, else it knowingly becomes a duplicate data source.

Preventing duplicate data using the homeCommunityID: So the point is that homeCommunityID is a strong indicator of a duplicate pathway that would result in duplicate data. Given that in Patient Discovery (XCPD) you target the patient discovery question to specific homeCommunityID(s), you are in control of which communities you target. Where you have already gotten a response back from a homeCommunity, you can skip the potentially duplicative Patient Discovery (XCPD), or you can ignore the secondary results if you already sent out the question. Having a secondary pathway choice also allows you to dynamically detect that the primary pathway is failing. Yes, you would need to identify primary vs. secondary preferences, logic for delayed attempts, and handling of delayed responses.
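
A minimal sketch of that logic, assuming a simple in-memory record of which communities have already answered (the function and variable names are illustrative, not from any specification):

    # Track which homeCommunityIDs have already responded, regardless of which
    # network pathway delivered the response.
    responded_communities = set()

    def should_send_xcpd(home_community_id):
        """Skip a Patient Discovery (XCPD) to a community that already answered
        through another pathway; otherwise it is worth asking."""
        return home_community_id not in responded_communities

    def record_xcpd_response(home_community_id):
        responded_communities.add(home_community_id)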

Duplicate Patient Id Assigning Authority (AA)

I first mentioned that there are protocols used to discover all the identifiers that a single patient has. Each identifier is made up of a Patient ID and the Assigning Authority (AA) that issued that patient ID. The patient ID Assigning Authority (AA) is the second-level indicator of a unique organization. This can be used today, because everyone does indeed manage their own patient identities, and thus must have a globally unique AA.

A special case is where a community aggregates patient identities into a community patient identity, such as happens in an XDS Affinity Domain. Like the sub-community issue above, this is likely not a problem, as those that participate in an XDS Affinity Domain tend to be small and want only one connection.

Where a nation issues the patient identifier, the Assigning Authority (AA) becomes just the national Assigning Authority and is no longer useful for de-duplication. In this case many organizations and communities would use the same assigning authority and patient identity. This does not cause duplicate data, but it does make the Assigning Authority less helpful at detecting duplicate data.

Duplicate Document UniqueID

The Document UniqueID is absolute proof of duplicate documents. The Document UniqueId is readily available in the Document Sharing (XDS/XCA) metadata, so it can be used at that level to keep from pulling a document unnecessarily. With other networks, like Direct or Patient apps, the Document UniqueId can be found within document types like CDA or FHIR. If a case is ever found where this can’t be used as absolute proof of a duplicate document, then the source of that document must be fixed.

This solution will work regardless of the network. This will work with XDS/XCA based networks, but will also work with FHIR based networks, or where the Patient uses an app of any kind. 
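
As a sketch (illustrative only), de-duplication at this level is just a set-membership test on the Document UniqueId, wherever that ID was found -- XDS/XCA metadata, a CDA header, or a FHIR DocumentReference:

    seen_document_ids = set()

    def is_new_document(document_unique_id):
        """Return True the first time a Document UniqueId is seen; False thereafter."""
        if document_unique_id in seen_document_ids:
            return False
        seen_document_ids.add(document_unique_id)
        return True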

On-demand documents deserve a special mention, but I will address them below.

Duplicate data element identifiers

The solution that would work best happens to be the one least likely to be available today.

The standards (CDA and FHIR) include the capability to uniquely identify data elements (resources). However, like a good standard, they allow you to not uniquely identify the data element. Yes, I said this was a good thing. It is a good thing for low-end scale. It is a really bad thing for a mature market. This is where Implementation Guides and Profiles come in. In the case of CDA there are implementation guides that do require each data element be uniquely identified, and that Provenance proof always accompany data. 

However, uniquely identifying data at the element level is very expensive: it is hard to code, it makes the database bigger, it adds validation steps, and so on. When that data is only used within the EHR, there is no value in all this extra overhead. Thus it is often never designed into an EHR.

Duplicate data thru Provenance

Special mention of Provenance... This is supported by the standards, but very poorly implemented. It is especially important when a unique piece of data is used beyond the initial use. For example, a lab result was taken for one condition, but it was also found to be helpful in a second diagnosis: both for the same patient, but different conditions or different episodes. This is especially true when that original data was exported from one system and imported into another, so a historic CDA was used at a different treatment encounter. That second use needs to give credit to the first: Provenance. How this factors into duplicate data is that a CDA document from the second encounter will include the very same data from the first. Now two different documents from two different organizations carry the same data, but that data has different element identifiers as it exists in two places. The solution is that Provenance can show the second instance is a copy of the first.
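
As a hedged illustration (roughly FHIR STU3 syntax; the resource IDs are made up), a Provenance resource attached to the copy can point back at the original element:

    provenance_copy_example = {
        "resourceType": "Provenance",
        "target": [{"reference": "Observation/copy-in-second-encounter"}],
        "recorded": "2018-04-12T10:00:00Z",
        "agent": [{"whoReference": {"reference": "Organization/second-organization"}}],
        "entity": [{
            "role": "quotation",   # the target quotes (copies) the entity below
            "whatReference": {"reference": "Observation/original-lab-result"}
        }]
    }

A de-duplication engine that finds this Provenance can treat the two element identifiers as the same underlying fact.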

I have worked with EHRs that could tell you where the data came from. If it was imported from a CDA received from some other organization, this was noted. Most of the time this Provenance was empty, so you assume the data was internally generated. But the capability was there on import; the database had support for Provenance. Using this data on export is another task, and thus an opportunity for a shortcut...

I also am the owner of the Provenance resource in FHIR.

Clinically same

This is what most de-duplication engines work on: they detect that the data found is already known and presume the data is duplicate. They leverage any identifiers in the data, but ultimately they are looking at the clinical value and determining that they already have the same clinical value.

This works except for longitudinal repetition that is clinically significant (an observation presents and resolves over and over).
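
A rough sketch of what such an engine does, with a time window so that clinically significant repetition is not collapsed (treating the tuple of code, value, and effective time as 'clinically same' is my simplification; real engines are far more nuanced):

    from datetime import timedelta

    def clinically_same(a, b, window=timedelta(hours=24)):
        """a and b are assumed to be dicts with 'code', 'value', and a datetime
        'effective'. Treat them as duplicates only if code and value match AND
        they occurred close together in time; repetition outside the window is
        kept as clinically significant."""
        return (a["code"] == b["code"]
                and a["value"] == b["value"]
                and abs(a["effective"] - b["effective"]) <= window)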

Duplicate On-Demand Documents

On-Demand documents present the hardest case of duplicate data to deal with. These are also detectable if they follow the IHE on-demand profile, in that the document entry that advertises the availability of on-demand data has a globally unique and stable identity. Thus you can know that you should NOT request a new on-demand instance be made, because you already know about the data. The problem is that you don't know whether that new instance would contain new data.

So the unique ID of this on-demand document entry needs to be handled carefully. Never pulling a new on-demand document will prevent you from ever learning of new data. However, pulling a new on-demand document unnecessarily will cause you to spend energy determining that all the data it contains is data you already knew. This is the trade-off between false-negatives and false-positives.

There are poorly implemented on-demand solutions that don't follow the IHE specification. They create a new on-demand document entry each time they are queried. This is not correct. There should be one uniquely identified document entry that everyone gets. When that document is requested is when the on-demand generation of the specific document is done, and that generated document should be stored as a 'snapshot'. These poorly implemented on-demand solutions will present two totally different document entries each time you query, so if you are querying via duplicate pathways, you will think you have found two totally different sources of unique data.

The good news is that if the generated document is of the highest quality, then the content can quickly be separated into data you already know and data that is new. That is to say, the element-level identity and/or Provenance can prevent unnecessary duplication.
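
A sketch of how this could be handled (the policy flag and helper are my own assumptions, not IHE requirements): remember the stable on-demand entry, decide whether a fresh snapshot is worth generating, and then keep only the element-level content not already known:

    known_on_demand_entries = {}   # entry uniqueId -> set of element IDs already seen

    def process_on_demand(entry_unique_id, fetch_snapshot, refresh_wanted):
        """fetch_snapshot() is assumed to retrieve a new snapshot and return its
        element-level IDs; refresh_wanted is a local policy decision."""
        if entry_unique_id in known_on_demand_entries and not refresh_wanted:
            return set()                        # known source, policy says don't re-pull
        element_ids = set(fetch_snapshot())     # generate/retrieve a fresh snapshot
        seen = known_on_demand_entries.setdefault(entry_unique_id, set())
        new_elements = element_ids - seen       # only the data not already known
        seen.update(element_ids)
        return new_elements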

Detecting a Duplicate

As you can see, there are many identifiers that, when they are found to be EQUAL, tell you that you have duplicates. I presented them from largest scope to smallest scope. The larger the scope you can use, the less energy it takes to stop processing duplicate data. This solution breaks down when the identifiers are not equal; in that case you are not assured that you do not have duplicate data. Thus the whole spectrum must be used; one level is not enough. Ultimately there will be false-positives and false-negatives.
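
Pulling the levels together, a minimal sketch of that cascade (field names are illustrative) for deciding whether results arriving over a second pathway add anything new -- check the broadest, cheapest identifiers first and fall through to the narrower ones only when needed:

    def looks_duplicate(incoming, already_processed):
        """already_processed holds sets of identifiers gathered from the data
        handled so far over other pathways."""
        # Broadest scope: this community (or assigning authority) was already covered
        if incoming.home_community_id in already_processed["home_community_ids"]:
            return True
        if incoming.assigning_authority in already_processed["assigning_authorities"]:
            return True
        # Document scope: the Document UniqueId is absolute proof
        if incoming.document_unique_id in already_processed["document_unique_ids"]:
            return True
        # Narrowest scope: element-level IDs, when they exist
        if incoming.element_ids & already_processed["element_ids"]:
            return True
        return False   # not provably duplicate; may still need clinical comparison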

Organizational Policy driving Maturity

Now that we have Interoperability, we need to address over-Interoperability. I see a need for HealtheWay, CareQuality, CommonWell, DirectTrust, and any other networks to have reasonable and good control of their identifiers. There is already a strong push to move to more coded documents like C-CDA R2.1. There are efforts around Provider Directories.

I don’t think this is a big effort; most do the right thing already today. What is needed is governance that says that the right behavior is expected, and that when improper behavior is found it must be fixed. The current Sequoia specifications do not address this level of detail.

Improvement is always good, but we must recognize that much of our health data is longitudinal, and it is very possible a document was created 10 years ago according to the best possible guidance at that time. That historic document likely contains good data, but does not conform to current best practice. Postel’s law must guide us: be specific in what you send, liberal in how you receive from others.

IHE Mobile Cross-Enterprise Document Data Element Extraction

I have worked on projects within both IHE and HL7 on these topics. I can’t claim they have solved the issue, but they have raised the common set of issues to be resolved and gathered good practice, as I outline above. The most recent project is one in IHE that starts with the Document Sharing infrastructures (XCA, XDS, and CDA), much like above, and presents the de-duplicated data using a FHIR API (QEDm). This solution builds upon the family of Document Sharing profiles and FHIR profiles IHE has.

See https://wiki.ihe.net/index.php/Mobile_Cross-Enterprise_Document_Data_Element_Extraction

Tuesday, April 10, 2018

IHE on FHIR tutorial

I will be giving a face-to-face tutorial on the topic of "IHE on FHIR" at both

  1. HL7 Workgroup meeting in Cologne, May 12-18
  2. FHIR Dev Days in Boston, June 19-21 
So, if you are in Europe, sign up for the tutorial at the HL7 workgroup meeting. If you are in the USA, sign up for the tutorial at FHIR Dev Days.

There is a difference in the time available to me. At FHIR Dev Days I will need to focus only on the IHE Profiles that leverage FHIR, whereas at the HL7 meeting I will also be able to discuss the overall relationship between HL7 and IHE in addition to those profiles.

Here are the IHE profiles that leverage FHIR today...
  • Radiology
  • Pharmacy
    • Uniform Barcode Processing (UBP) -  describes a way to send the information contained in a barcode and in return receive the parsed content of that barcode in the form of a FHIR resource instance - a medication, a device, a patient, or staff.
    • Mobile Medication Administration (MMA) - defines the integration between healthcare systems and mobile (or any other) clients using RESTful web services. This allows connecting EHRs with smartphones, smart pill boxes, and other personal or professional devices.
  • QRPH
    • Mobile Retrieve Form for Data Capture (mRFD) - provides a method for gathering data within a user's current application to meet the requirements of an external system
    • Vital Records Death Reporting (VRDR) - defines an mRFD content profile that specifies derivation of source content from a medical summary document, by defining requirements for form filler content and form manager handling of the content.

I have succeeded in getting the IHE Profiles listed on the FHIR.org registry of Implementation Guides. This is a manual step today, and it will likely continue to be a manual step to assure that what gets published has passed the IHE Governance for being listed.

See my other articles on FHIR

Friday, April 6, 2018

Patient Centered HIE

The Patient is NOT the center of existing Health Information Exchange. Yet, the Health Information Exchange exists for the sole purpose of treating that Patient. These two factual statements are completely opposite. How can they both be facts?


The patient does not feel like the center. It is all about Experience, in agile terms "User Experience". If the user does not feel the experience they want, then they are NOT happy. It does not matter how hard the product is trying to make the user happy. If they are not feeling it, then they are not happy.

The Health Information Exchanges today have an existing Architecture. This architecture is a good architecture. Yes I am speaking about ALL of the various architectures of Health Information Exchanges. Centralized, Distributed, Federated, and discombobulated. They all are good architectures. The architecture is not the problem. Yet, our healthcare leadership keeps looking for a new technical architecture to solve this problem. 

I assert this is NOT because of the architecture of Health Information Exchanges, BUT rather because of a lack of consideration of how the Patient would or could experience the impact made available by the Health Information Exchange. The architecture supports the following Patient Centered use-cases.

Let's think about the Patient -- User Experience. I offer the Privacy Principles as the use-cases to drive this User Experience. You can certainly also consider healthcare treatment use-cases; I won't, simply because I am not an expert there. I do fully expect that Care Plans covering chronic care conditions will be the killer-app (sorry) for Health Information Exchanges.

Patients controlling use of their data

Provide the ability for the Patient to set rules for how their data can be used. This is otherwise called Consent, but many people have a very constrained definition of Consent, so I am happy to describe it as rules for how the data can be used. When I say 'used' I also mean the broadest definition. Again, some people see "Collection", "Use", and "Disclosure" as special words in Privacy. I don't mean that restricted word 'use'; I also include collection, disclosure, and anything else that uses the data.

Yes these rules do need to consider "Treatment" as special, but not uncontrollable. There can still be rules of use in treatment. This recognizes the needs some patients have with sensitive health topics, or concerns about those in the privileged position of treatment.
Emergency Treatment should be seen as a special case of Treatment. Today it is all too often bundled into simply Treatment. I would argue that Emergency Treatment as a use allows a much broader audience of care providers (including Police, Fire, EMT, FEMA) access to a more constrained body of data (a customized Medical Summary holding the critical few data elements), to enable more optimized stabilizing of the patient.
Control of the data also includes allowing the patient to engage their data in use beyond treatment. How about allowing the patient to directly authorize clinical research projects? I should be able to authorize a research project to have full access to my historic and future data.
Controlling the data use is fundamental. The use is not without restrictions, but empowering.

Patients knowing how their data is used

Provide an Access Log. I would say Accounting of Disclosures, but there are simply too many exceptions, so that results in a useless, and empty, log. I want an Access Log, that is, a log of every time my data was accessed (Direct or Exchange or FHIR). Who requested the access? What did they ask for? What did they get? When was this? Where was this? Why did they access it (PurposeOfUse)?
There is no network that I know of that provides a view of how the HIE was used to expose the Patient's data. I recognize the concern that Covered Entities have that gives us "Normal Operations" exceptions. I don't like these exceptions, but I understand why they exist. I think that ALL accesses over a Health Information Exchange need to be reported to the Patient.
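
Nothing deployed today offers this, but as a hedged sketch of what it could look like with FHIR: an AuditEvent search scoped to the patient (the endpoint is made up, and it assumes the server supports the patient search parameter and lets the patient see these records):

    import requests

    # Hypothetical patient-facing FHIR endpoint; the token would come from the
    # patient's own authorization, not shown here.
    base = "https://hie.example.org/fhir"
    resp = requests.get(
        f"{base}/AuditEvent",
        params={"patient": "Patient/123", "date": "ge2018-01-01"},
        headers={"Authorization": "Bearer <patient-token>",
                 "Accept": "application/fhir+json"},
    )
    for entry in resp.json().get("entry", []):
        event = entry["resource"]
        # Who, what, when, and why -- the questions an Access Log should answer
        print(event.get("recorded"), event.get("purposeOfEvent"), event.get("agent"))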

Patients themselves using their data

Provide API access for applications the Patient chooses and authorizes. In the past this would be covered by the statement "PHR", but that concept is too limited today. This item is inclusive of the older concept of a PHR, but is also inclusive of newer health Apps. Where a PHR was a system that would copy the patient data and give the patient the ability to connect apps to that copy of the data, nowadays we are looking to use FHIR as a way to connect these Apps directly to the data. These apps will tend to just need read-only access, but...
The Patient should be able to be a peer on the Health Information Exchange.
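
A minimal sketch of that experience from the app side (the endpoint is an assumption, and it presumes the server supports the Patient $everything operation and a patient-authorized token, e.g. via SMART on FHIR):

    import requests

    base = "https://hie.example.org/fhir"          # hypothetical patient-facing endpoint
    token = "<access-token-authorized-by-the-patient>"

    # Read-only retrieval of the patient's own record as one Bundle
    resp = requests.get(
        f"{base}/Patient/123/$everything",
        headers={"Authorization": f"Bearer {token}",
                 "Accept": "application/fhir+json"},
    )
    my_record = resp.json()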

Patients authoring their data

Provide capability of the Patient to author data. Many patients, myself included, are using many home-care devices, personal-care devices, health-monitoring devices, and sports related devices. These are producing a wealth of information, much of it is just background measurements. 

These measurements are not accessible to Providers unless they can be contributed on behalf of the Patient. Clinicians might not want to be bothered by these volumes of data from uncalibrated devices, but the fact that the data is NOT accessible at all is a problem.

How about the Patient and their Devices authoring directly into FHIR? Given this is a new concept, there should be no momentum to use old technology.
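
A sketch of what that could look like: the patient's device, or an app acting for the patient, POSTs an Observation straight to a FHIR endpoint (the endpoint and Device reference are made up; the LOINC code shown is heart rate):

    import requests

    observation = {
        "resourceType": "Observation",
        "status": "final",
        "code": {"coding": [{"system": "http://loinc.org", "code": "8867-4",
                             "display": "Heart rate"}]},
        "subject": {"reference": "Patient/123"},
        "effectiveDateTime": "2018-04-06T07:30:00Z",
        "valueQuantity": {"value": 62, "unit": "beats/minute",
                          "system": "http://unitsofmeasure.org", "code": "/min"},
        "device": {"reference": "Device/patients-wearable"},
    }
    requests.post("https://hie.example.org/fhir/Observation",
                  json=observation,
                  headers={"Authorization": "Bearer <patient-token>",
                           "Content-Type": "application/fhir+json"})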

Patients correcting mistakes in their data

Provide the Patient the ability to challenge the validity of data. Once we can see the data, we will surely find some mistakes. Being able to challenge the validity of the data is essential. Even HIPAA acknowledged the need for the Patient to 'Amend' their data. I say 'challenge' so as to be closer to "Patient-Centered" or "Patient-Engaged".
This would require some standards development. What does a patient challenge look like, what are the data elements, and what is the trustworthiness of the challenge? How does the patient engage with this mechanism? How does the result of a successful challenge look to users (clinicians) of data that has been challenged?
I'm sure there are others. I base these on the Privacy Principles.

Why NOT?

The reason why ALL of these are not done is that there is no obvious place where the Patient can go to get engaged. There have been attempts, most recently by Apple; some technologists think Blockchain is the solution. They all have failed, and all will continue to fail, because the Patient population is not a stakeholder. Each individual is a stakeholder, but individuals are not powerful enough. They only become a population of stakeholders when they get together with a common demand. Realistically, the population of Patients doesn't want to pay for this. Follow the lack of money, and you see the problem.

This is not just a USA problem, but is true globally (comments welcome). However in the USA, the government's insistence on forbidding a federal identifier is very detrimental. It keeps anyone from being able to take this role.

So, the Patient must today get their engagement through their GP. Thus the GP is the Patient Experience, and how well the GP crafts this experience determines how engaged the Patient is. The GP is more likely to make engagement with the GP a better experience than engaging with other healthcare providers or outside researchers. Thus there are thousands and thousands of very different Patient Experiences. It is no wonder then that the Patient is not feeling like the center of the Health Information Exchange.

I have no solution... sorry...  I do know that technology is not the problem. I do know that architecture is not the problem. I do know that standards are not the problem.  I do know that switching from CDA to FHIR is a good thing, but will not fix this. I like blockchain architecture and capability, but it will not fix this problem.

To understand the solution from the perspective of the standards, technology, architecture, and theory... This is my blog

Wednesday, April 4, 2018

Basics of doing Document Sharing Query right

IHE is currently working on a "Handbook" intended to instruct an XDS Affinity Domain, Community (XCA), or MHD community on how to structure their requirements on metadata. This effort is long overdue, as IHE has relied on the communities to figure this out themselves. More communities have failed than have succeeded. I am just very grateful that they keep trying. Mostly they keep trying because the most basic query is just asking for all documents available for a given Patient. This is necessary, but not sufficient. Let me explain the next level of Document Sharing (XDS, XCA, MHD) Query.

As part of the metadata handbook discussion, Charles has clarified in a very elegant way a "Best Practice" for leveraging the XDS/XCA/MHD Query versus local processing of the resulting document entry/reference metadata to achieve the best results. This "Best Practice" should be used, but I know it is not used. The main reason it is not used is that it was never explained to the IHE readership.

XDS Query

The XDS Query transaction has a huge number of stored queries. Realistically there is only ONE stored query that is needed: FindDocuments.

The other stored queries are not useless, but they are far more special purpose. They are more focused on SubmissionSets, Folders, and Associations. These are useful, just not primary for a general-purpose Document Consumer. They are actually essential when one gets into use-cases that require these other capabilities.

Degenerate Query

The fact that FindDocuments is so dominant is exposed more strongly in that some servers only implement FindDocuments and don't support any of the other queries. This is especially true of XCA (cross-community), which means they don't even support SubmissionSets, Folders, or Associations. This is also true of the Argonaut specification, where SubmissionSets, Folders, and Associations are not supported.

XDS FindDocuments Query Parameters

The FindDocuments query has 18 query parameters. You only need 5 of them (a sketch of these appears after the list below). The other 13 parameters are possible to use, but they are more likely to produce poor results. They are there to assure that FindDocuments is complete, but the use of these additional parameters is very fragile. Fragile in that the consuming system must be in really good synchronization with the publication system, and that is simply unlikely longitudinally over decades. Later I explain how to deal with this fragility.

  1. PatientId -- this is required in XDS, but I will mention it just for completeness. You must have a Patient ID you are interested in. Use of PIX, PDQ, XCPD, or some other Patient Identity Management system is a required prerequisite. 
  2. ClassCode -- this is the most poorly understood metadata element, yet it was intended to be the most powerful. The idea is a major focus of the new IHE "Handbook", where we explain that only a small number of vocabulary terms should be allowed, terms that group documents into logical 'classifications' useful to a Document Consumer. That is, they should be designed (vocabulary design -- value set) such that for any use-case where someone is looking for documents, they can pick one or two terms from this valueset that are most likely to return results.
  3. ServiceStartTimeFrom -- ServiceStopTimeTo -- these work together to give the period of time that the documents are about. This is different from the creation time, which is when the document was created; the service times are more specific to the time range of the treatment. So an episode summary would have the time range of the episode. It is important to note that these two parameters work together to give a period of time, and that period of time can be open at the start (beginning of time) or open at the end (end of time). Thus one can ask for documents covering treatment prior to 1998. Another example is asking for only the documents covering the last 6 months, by specifying a StartTimeFrom and leaving the stop time open.
  4. PracticeSettingCode -- this is the clinical specialty where the act that resulted in the document was performed. Like the classCode, this should be filled from a controlled valueSet of pre-negotiated vocabulary that represents broad classifications of practice settings.
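
As a sketch, here is that minimal parameter set expressed as FindDocuments stored-query slots (the identifiers, codes, and OIDs are placeholders; note that the transaction itself also requires $XDSDocumentEntryStatus):

    # Illustrative FindDocuments parameters -- all values are placeholders
    find_documents_params = {
        "$XDSDocumentEntryPatientId": "'pid-123^^^&1.2.3.4.5&ISO'",
        "$XDSDocumentEntryClassCode": "('summary-code^^1.2.3.4.5.6')",
        "$XDSDocumentEntryServiceStartTimeFrom": "20171001000000",   # care on or after Oct 2017
        "$XDSDocumentEntryServiceStopTimeTo": "20180401000000",      # ... and before Apr 2018
        "$XDSDocumentEntryPracticeSettingCode": "('cardiology-code^^1.2.3.4.5.7')",
        "$XDSDocumentEntryStatus": "('urn:oasis:names:tc:ebxml-regrep:StatusType:Approved')",
    }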

Classification ValueSets are critical

What the above shows is that two of the critical FindDocuments query parameters should come from well-controlled value sets: value sets that have a few (10-20) vocabulary values representing broad classifications.

These codes need to be useful, but useful to someone doing a Query. Too often these codes are considered only from the perspective of publishing a Document Entry. Yes, they need to be filled out when the Document Entry is published, but they need to be useful for Query.

So, how does a community determine what these valuesets should contain? THAT is the whole purpose of the new IHE metadata "handbook"... I too, await this set of principles, process, and mechanism.

Query is not enough

The whole purpose of the XDS Metadata is to enable processing of the documents so that the right information can be found. The query parameters above are necessary, but not sufficient.

Critical in the "Best Practice" that Charles explained is that a Document Consumer must be ready to some form of local processing. This local processing would leverage ALL of the metadata. This local processing might further eliminate unnecessary entries, might sort the results, might put emphasis on some entries because of specific metadata entries. This local processing might be automated algorithm, or might involve a human. Likely both.

More on Document Sharing Management (Health Information Exchange - HIE)