Empi 69 and candidates (#1936)

Broaden EMPI blocking searches to support ANDed search params as well as ORs
Ken Stevens 2020-06-24 09:12:56 -04:00 committed by GitHub
parent e65c264927
commit 38a2b00663
12 changed files with 204 additions and 58 deletions

View File

@ -11,15 +11,15 @@ Here is an example of a full HAPI EMPI rules json document:
"candidateSearchParams": [
{
"resourceType": "Patient",
"searchParam": "birthdate"
"searchParams": ["given", "family"]
},
{
"resourceType": "*",
"searchParam": "identifier"
"searchParams": ["identifier"]
},
{
"resourceType": "Patient",
"searchParam": "general-practitioner"
"searchParams": ["general-practitioner"]
}
],
"candidateFilterSearchParams": [
@ -56,18 +56,21 @@ Here is an example of a full HAPI EMPI rules json document:
Here is a description of how each section of this document is configured.
* **candidateSearchParams**: These define fields which must have at least one exact match before two resources are considered for matching. This is like a list of "pre-searches" that find potential candidates for matches, to avoid the expensive operation of running a match score calculation on all resources in the system. E.g. you may only wish to consider matching two Patients if they either share at least one identifier in common or have the same birthday. The HAPI FHIR server executes each of these searches separately and then takes the union of the results, so you can think of these as `OR` criteria that cast a wide net for potential candidates. In some EMPI systems, these "pre-searches" are called "blocking" searches (since they identify "blocks" of candidates that will be searched for matches).
### candidateSearchParams
These define fields which must have at least one exact match before two resources are considered for matching. This is like a list of "pre-searches" that find potential candidates for matches, to avoid the expensive operation of running a match score calculation on all resources in the system. E.g. you may only wish to consider matching two Patients if they either share at least one identifier in common or have the same birthday. The HAPI FHIR server executes each of these searches separately and then takes the union of the results, so you can think of these as `OR` criteria that cast a wide net for potential candidates. In some EMPI systems, these "pre-searches" are called "blocking" searches (since they identify "blocks" of candidates that will be searched for matches).
```json
[ {
"resourceType" : "Patient",
"searchParam" : "birthdate"
"searchParams" : ["given", "family"]
}, {
"resourceType" : "Patient",
"searchParam" : "identifier"
} ]
```
* **candidateFilterSearchParams** When searching for match candidates, only resources that match this filter are considered. E.g. you may wish to only search for Patients for which active=true. Another way to think of these filters is all of them are "AND"ed with each candidateSearchParam above.
### candidateFilterSearchParams
When searching for match candidates, only resources that match this filter are considered. E.g. you may wish to only search for Patients for which active=true. Another way to think of these filters is all of them are "AND"ed with each candidateSearchParam above.
```json
[ {
"resourceType" : "Patient",
@ -76,7 +79,35 @@ Here is a description of how each section of this document is configured.
} ]
```
* **matchFields** Once the match candidates have been found, they are then each compared to the incoming Patient resource. This comparison is made across a list of `matchField`s. Each matchField returns `true` or `false` indicating whether the candidate and the incoming Patient match on that field. There are two types of metrics: `Matcher` and `Similarity`. Matcher metrics return a `true` or `false` directly, whereas Similarity metrics return a score between 0.0 (no match) and 1.0 (exact match) and this score is translated to a `true/false` via a `matchThreshold`. E.g. if a `JARO_WINKLER` matchField is configured with a `matchThreshold` of 0.8 then that matchField will return `true` if the `JARO_WINKLER` similarity evaluates to a score >= 0.8.
For example, if the incoming patient looked like this:
```json
{
"resourceType": "Patient",
"id": "example",
"identifier": [{
"system": "urn:oid:1.2.36.146.595.217.0.1",
"value": "12345"
}],
"name": [
{
"family": "Chalmers",
"given": [
"Peter",
"James"
]
}
]
}
```
then the above `candidateSearchParams` and `candidateFilterSearchParams` would result in the following two consecutive searches for candidates:
* `Patient?given=Peter,James&family=Chalmers&active=true`
* `Patient?identifier=urn:oid:1.2.36.146.595.217.0.1|12345&active=true`
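The expansion of these rules into candidate searches can be sketched in plain Java. This is a simplified illustration, not the HAPI FHIR implementation: the class `CandidateQuerySketch`, the method `buildQuery`, and the `Map` standing in for values extracted from the incoming resource are all assumptions made for this example. Each `candidateSearchParams` entry yields one query in which its `searchParams` are ANDed together and the filter criteria are appended; an entry for which the resource has no values yields no query.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class CandidateQuerySketch {

	// Builds one candidate query for a single candidateSearchParams entry.
	// Returns an empty Optional if the resource has no values for any listed search param.
	public static Optional<String> buildQuery(String resourceType,
			List<String> searchParams,
			Map<String, List<String>> valuesFromResource,
			List<String> filterCriteria) {
		List<String> criteria = new ArrayList<>();
		for (String param : searchParams) {
			List<String> values = valuesFromResource.getOrDefault(param, new ArrayList<>());
			if (!values.isEmpty()) {
				// Multiple values for one param are comma-joined (an OR within that param)
				criteria.add(param + "=" + String.join(",", values));
			}
		}
		if (criteria.isEmpty()) {
			return Optional.empty();
		}
		// Filter criteria are ANDed onto every candidate search
		criteria.addAll(filterCriteria);
		return Optional.of(resourceType + "?" + String.join("&", criteria));
	}

	public static void main(String[] args) {
		// Hypothetical stand-in for values extracted from the incoming Patient
		Map<String, List<String>> values = new LinkedHashMap<>();
		values.put("given", Arrays.asList("Peter", "James"));
		values.put("family", Arrays.asList("Chalmers"));
		values.put("identifier", Arrays.asList("urn:oid:1.2.36.146.595.217.0.1|12345"));
		List<String> filters = Arrays.asList("active=true");

		// One query per candidateSearchParams entry; the results would then be unioned
		buildQuery("Patient", Arrays.asList("given", "family"), values, filters)
			.ifPresent(System.out::println);
		buildQuery("Patient", Arrays.asList("identifier"), values, filters)
			.ifPresent(System.out::println);
	}
}
```

Running `main` prints the same two search URLs listed above, one per `candidateSearchParams` entry.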
### matchFields
Once the match candidates have been found, they are then each compared to the incoming Patient resource. This comparison is made across a list of `matchField`s. Each matchField returns `true` or `false` indicating whether the candidate and the incoming Patient match on that field. There are two types of metrics: `Matcher` and `Similarity`. Matcher metrics return a `true` or `false` directly, whereas Similarity metrics return a score between 0.0 (no match) and 1.0 (exact match) and this score is translated to a `true/false` via a `matchThreshold`. E.g. if a `JARO_WINKLER` matchField is configured with a `matchThreshold` of 0.8 then that matchField will return `true` if the `JARO_WINKLER` similarity evaluates to a score >= 0.8.
By default, all matchFields have `exact=false`, which means that their values will have all diacritical marks removed and be converted to upper case before matching. `exact=true` can be added to any matchField to compare the strings as they are originally capitalized and accented.
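To illustrate the threshold and `exact` behaviour described here, the following sketch converts a similarity score to `true`/`false` and normalizes strings before a non-exact comparison. It is hypothetical: `MatchFieldSketch`, `passesThreshold`, `normalize`, and `fieldMatches` are names invented for this example, and the normalization uses the JDK's `java.text.Normalizer`, which may differ from what HAPI FHIR does internally.

```java
import java.text.Normalizer;
import java.util.Locale;

public class MatchFieldSketch {

	// A Similarity score becomes a boolean by comparing it against matchThreshold
	public static boolean passesThreshold(double score, double matchThreshold) {
		return score >= matchThreshold;
	}

	// Assumed normalization for exact=false: strip diacritical marks, then upper-case
	public static String normalize(String value) {
		String decomposed = Normalizer.normalize(value, Normalizer.Form.NFD);
		return decomposed.replaceAll("\\p{M}", "").toUpperCase(Locale.ROOT);
	}

	// exact=true compares strings as written; exact=false compares normalized forms
	public static boolean fieldMatches(String left, String right, boolean exact) {
		if (exact) {
			return left.equals(right);
		}
		return normalize(left).equals(normalize(right));
	}

	public static void main(String[] args) {
		System.out.println(passesThreshold(0.85, 0.8));
		System.out.println(normalize("Jos\u00e9"));
		System.out.println(fieldMatches("Jos\u00e9", "JOSE", false));
		System.out.println(fieldMatches("Jos\u00e9", "JOSE", true));
	}
}
```

Under these assumptions, a score of 0.85 passes a threshold of 0.8, and "José" matches "JOSE" only when `exact` is `false`.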
@ -250,7 +281,9 @@ The following metrics are currently supported:
</tbody>
</table>
* **matchResultMap** converts combinations of successful matchFields into an EMPI Match Result for overall matching of a given pair of resources. MATCH results take precedence over POSSIBLE_MATCH results.
### matchResultMap
These entries convert combinations of successful matchFields into an EMPI Match Result for overall matching of a given pair of resources. MATCH results take precedence over POSSIBLE_MATCH results.
```json
{
@ -261,4 +294,6 @@ The following metrics are currently supported:
}
```
* **eidSystem**: The external EID system that the HAPI EMPI system should expect to see on incoming Patient resources. Must be a valid URI. See [EMPI EID](/hapi-fhir/docs/server_jpa_empi/empi_eid.html) for details on how EIDs are managed by HAPI EMPI.
### eidSystem
The external EID system that the HAPI EMPI system should expect to see on incoming Patient resources. Must be a valid URI. See [EMPI EID](/hapi-fhir/docs/server_jpa_empi/empi_eid.html) for details on how EIDs are managed by HAPI EMPI.

View File

@ -37,6 +37,7 @@ import ca.uhn.fhir.jpa.empi.broker.EmpiMessageHandler;
import ca.uhn.fhir.jpa.empi.broker.EmpiQueueConsumerLoader;
import ca.uhn.fhir.jpa.empi.interceptor.EmpiStorageInterceptor;
import ca.uhn.fhir.jpa.empi.interceptor.IEmpiStorageInterceptor;
import ca.uhn.fhir.jpa.empi.svc.EmpiCandidateSearchCriteriaBuilderSvc;
import ca.uhn.fhir.jpa.empi.svc.EmpiCandidateSearchSvc;
import ca.uhn.fhir.jpa.empi.svc.EmpiEidUpdateService;
import ca.uhn.fhir.jpa.empi.svc.EmpiLinkQuerySvcImpl;
@ -157,6 +158,11 @@ public class EmpiConsumerConfig {
return new EmpiCandidateSearchSvc();
}
@Bean
EmpiCandidateSearchCriteriaBuilderSvc empiCriteriaBuilderSvc() {
return new EmpiCandidateSearchCriteriaBuilderSvc();
}
@Bean
EmpiResourceMatcherSvc empiResourceComparatorSvc(FhirContext theFhirContext, IEmpiSettings theEmpiConfig) {
return new EmpiResourceMatcherSvc(theFhirContext, theEmpiConfig);

View File

@ -0,0 +1,45 @@
package ca.uhn.fhir.jpa.empi.svc;
import ca.uhn.fhir.empi.rules.json.EmpiResourceSearchParamJson;
import org.hl7.fhir.instance.model.api.IAnyResource;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import javax.annotation.Nonnull;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
@Service
public class EmpiCandidateSearchCriteriaBuilderSvc {
@Autowired
private EmpiSearchParamSvc myEmpiSearchParamSvc;
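/*
* Given a resource type, the incoming resource, a list of filter criteria, and the search parameters upon which to block,
* build a query url that ANDs together the search parameters for which the resource has values, followed by the filter criteria. e.g.
*
* Patient?given=Gary,Grant&family=Graham&active=true
*
* Returns an empty Optional if the resource has no values for any of the search parameters.
*/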
@Nonnull
public Optional<String> buildResourceQueryString(String theResourceType, IAnyResource theResource, List<String> theFilterCriteria, EmpiResourceSearchParamJson resourceSearchParam) {
List<String> criteria = new ArrayList<>();
resourceSearchParam.iterator().forEachRemaining(searchParam -> {
//to compare it to all known PERSON objects, using the overlapping search parameters that they have.
List<String> valuesFromResourceForSearchParam = myEmpiSearchParamSvc.getValueFromResourceForSearchParam(theResource, searchParam);
if (!valuesFromResourceForSearchParam.isEmpty()) {
criteria.add(buildResourceMatchQuery(searchParam, valuesFromResourceForSearchParam));
}
});
if (criteria.isEmpty()) {
return Optional.empty();
}
criteria.addAll(theFilterCriteria);
return Optional.of(theResourceType + "?" + String.join("&", criteria));
}
private String buildResourceMatchQuery(String theSearchParamName, List<String> theResourceValues) {
return theSearchParamName + "=" + String.join(",", theResourceValues);
}
}

View File

@ -35,13 +35,12 @@ import org.slf4j.Logger;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import javax.annotation.Nonnull;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;
import static ca.uhn.fhir.empi.api.EmpiConstants.ALL_RESOURCE_SEARCH_PARAM_TYPE;
@ -58,6 +57,11 @@ public class EmpiCandidateSearchSvc {
private DaoRegistry myDaoRegistry;
@Autowired
private IdHelperService myIdHelperService;
@Autowired
private EmpiCandidateSearchCriteriaBuilderSvc myEmpiCandidateSearchCriteriaBuilderSvc;
public EmpiCandidateSearchSvc() {
}
/**
* Given a target resource, search for all resources that are considered an EMPI match based on defined EMPI rules.
@ -81,13 +85,7 @@ public class EmpiCandidateSearchSvc {
continue;
}
//to compare it to all known PERSON objects, using the overlapping search parameters that they have.
List<String> valuesFromResourceForSearchParam = myEmpiSearchParamSvc.getValueFromResourceForSearchParam(theResource, resourceSearchParam);
if (valuesFromResourceForSearchParam.isEmpty()) {
continue;
}
searchForIdsAndAddToMap(theResourceType, matchedPidsToResources, filterCriteria, resourceSearchParam, valuesFromResourceForSearchParam);
searchForIdsAndAddToMap(theResourceType, theResource, matchedPidsToResources, filterCriteria, resourceSearchParam);
}
//Obviously we don't want to consider the freshly added resource as a potential candidate.
//Sometimes, we are running this function on a resource that has not yet been persisted,
@ -111,9 +109,13 @@ public class EmpiCandidateSearchSvc {
* 4. Store all results in `theMatchedPidsToResources`
*/
@SuppressWarnings("rawtypes")
private void searchForIdsAndAddToMap(String theResourceType, Map<Long, IAnyResource> theMatchedPidsToResources, List<String> theFilterCriteria, EmpiResourceSearchParamJson resourceSearchParam, List<String> theValuesFromResourceForSearchParam) {
private void searchForIdsAndAddToMap(String theResourceType, IAnyResource theResource, Map<Long, IAnyResource> theMatchedPidsToResources, List<String> theFilterCriteria, EmpiResourceSearchParamJson resourceSearchParam) {
//1.
String resourceCriteria = buildResourceQueryString(theResourceType, theFilterCriteria, resourceSearchParam, theValuesFromResourceForSearchParam);
Optional<String> oResourceCriteria = myEmpiCandidateSearchCriteriaBuilderSvc.buildResourceQueryString(theResourceType, theResource, theFilterCriteria, resourceSearchParam);
if (!oResourceCriteria.isPresent()) {
return;
}
String resourceCriteria = oResourceCriteria.get();
ourLog.debug("Searching for {} candidates with {}", theResourceType, resourceCriteria);
//2.
@ -139,24 +141,6 @@ public class EmpiCandidateSearchSvc {
}
}
/*
* Given a list of criteria upon which to block, a resource search parameter, and a list of values for that given search parameter,
* build a query url. e.g.
*
* Patient?active=true&name.given=Gary,Grant
*/
@Nonnull
private String buildResourceQueryString(String theResourceType, List<String> theFilterCriteria, EmpiResourceSearchParamJson resourceSearchParam, List<String> theValuesFromResourceForSearchParam) {
List<String> criteria = new ArrayList<>(theFilterCriteria);
criteria.add(buildResourceMatchQuery(resourceSearchParam.getSearchParam(), theValuesFromResourceForSearchParam));
return theResourceType + "?" + String.join("&", criteria);
}
private String buildResourceMatchQuery(String theSearchParamName, List<String> theResourceValues) {
return theSearchParamName + "=" + String.join(",", theResourceValues);
}
private List<String> buildFilterQuery(List<EmpiFilterSearchParamJson> theFilterSearchParams, String theResourceType) {
return Collections.unmodifiableList(theFilterSearchParams.stream()
.filter(spFilterJson -> paramIsOnCorrectType(theResourceType, spFilterJson))

View File

@ -23,7 +23,6 @@ package ca.uhn.fhir.jpa.empi.svc;
import ca.uhn.fhir.context.FhirContext;
import ca.uhn.fhir.context.RuntimeResourceDefinition;
import ca.uhn.fhir.context.RuntimeSearchParam;
import ca.uhn.fhir.empi.rules.json.EmpiResourceSearchParamJson;
import ca.uhn.fhir.jpa.searchparam.MatchUrlService;
import ca.uhn.fhir.jpa.searchparam.SearchParameterMap;
import ca.uhn.fhir.jpa.searchparam.extractor.SearchParamExtractorService;
@ -51,9 +50,9 @@ public class EmpiSearchParamSvc implements ISearchParamRetriever {
return myMatchUrlService.translateMatchUrl(theResourceCriteria, resourceDef);
}
public List<String> getValueFromResourceForSearchParam(IBaseResource theResource, EmpiResourceSearchParamJson theFilterSearchParam) {
public List<String> getValueFromResourceForSearchParam(IBaseResource theResource, String theSearchParam) {
String resourceType = myFhirContext.getResourceType(theResource);
RuntimeSearchParam activeSearchParam = mySearchParamRegistry.getActiveSearchParam(resourceType, theFilterSearchParam.getSearchParam());
RuntimeSearchParam activeSearchParam = mySearchParamRegistry.getActiveSearchParam(resourceType, theSearchParam);
return mySearchParamExtractorService.extractParamValuesAsStrings(activeSearchParam, theResource);
}

View File

@ -0,0 +1,69 @@
package ca.uhn.fhir.jpa.empi.svc;
import ca.uhn.fhir.empi.rules.json.EmpiResourceSearchParamJson;
import ca.uhn.fhir.jpa.empi.BaseEmpiR4Test;
import org.hl7.fhir.r4.model.HumanName;
import org.hl7.fhir.r4.model.Patient;
import org.junit.Test;
import org.springframework.beans.factory.annotation.Autowired;
import java.util.Collections;
import java.util.Optional;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.anyOf;
import static org.hamcrest.Matchers.equalTo;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;
public class EmpiCandidateSearchCriteriaBuilderSvcTest extends BaseEmpiR4Test {
@Autowired
EmpiCandidateSearchCriteriaBuilderSvc myEmpiCandidateSearchCriteriaBuilderSvc;
@Test
public void testEmptyCase() {
Patient patient = new Patient();
EmpiResourceSearchParamJson searchParamJson = new EmpiResourceSearchParamJson();
searchParamJson.addSearchParam("family");
Optional<String> result = myEmpiCandidateSearchCriteriaBuilderSvc.buildResourceQueryString("Patient", patient, Collections.emptyList(), searchParamJson);
assertFalse(result.isPresent());
}
@Test
public void testSimpleCase() {
Patient patient = new Patient();
patient.addName().setFamily("Fernandez");
EmpiResourceSearchParamJson searchParamJson = new EmpiResourceSearchParamJson();
searchParamJson.addSearchParam("family");
Optional<String> result = myEmpiCandidateSearchCriteriaBuilderSvc.buildResourceQueryString("Patient", patient, Collections.emptyList(), searchParamJson);
assertTrue(result.isPresent());
assertEquals("Patient?family=Fernandez", result.get());
}
@Test
public void testComplexCase() {
Patient patient = new Patient();
HumanName humanName = patient.addName();
humanName.addGiven("Jose");
humanName.addGiven("Martin");
humanName.setFamily("Fernandez");
EmpiResourceSearchParamJson searchParamJson = new EmpiResourceSearchParamJson();
searchParamJson.addSearchParam("given");
searchParamJson.addSearchParam("family");
Optional<String> result = myEmpiCandidateSearchCriteriaBuilderSvc.buildResourceQueryString("Patient", patient, Collections.emptyList(), searchParamJson);
assertTrue(result.isPresent());
assertThat(result.get(), anyOf(equalTo("Patient?given=Jose,Martin&family=Fernandez"), equalTo("Patient?given=Martin,Jose&family=Fernandez")));
}
@Test
public void testIdentifier() {
Patient patient = new Patient();
patient.addIdentifier().setSystem("urn:oid:1.2.36.146.595.217.0.1").setValue("12345");
EmpiResourceSearchParamJson searchParamJson = new EmpiResourceSearchParamJson();
searchParamJson.addSearchParam("identifier");
Optional<String> result = myEmpiCandidateSearchCriteriaBuilderSvc.buildResourceQueryString("Patient", patient, Collections.emptyList(), searchParamJson);
assertTrue(result.isPresent());
assertEquals("Patient?identifier=urn:oid:1.2.36.146.595.217.0.1|12345", result.get());
}
}

View File

@ -25,7 +25,7 @@ public class EmpiCandidateSearchSvcTest extends BaseEmpiR4Test {
public void testFindCandidates() {
Patient jane = buildJanePatient();
jane.setActive(true);
Patient createdJane = createPatient(jane);
createPatient(jane);
Patient newJane = buildJanePatient();
Collection<IAnyResource> result = myEmpiCandidateSearchSvc.findCandidates("Patient", newJane);

View File

@ -2,15 +2,15 @@
"candidateSearchParams": [
{
"resourceType": "Patient",
"searchParam": "birthdate"
"searchParams": ["birthdate"]
},
{
"resourceType": "*",
"searchParam": "identifier"
"searchParams": ["identifier"]
},
{
"resourceType": "Patient",
"searchParam": "general-practitioner"
"searchParams": ["general-practitioner"]
}
],
"candidateFilterSearchParams": [

View File

@ -67,8 +67,9 @@ public class EmpiRuleValidator {
}
private void validateSearchParams(EmpiRulesJson theEmpiRulesJson) {
for (EmpiResourceSearchParamJson searchParam : theEmpiRulesJson.getCandidateSearchParams()) {
validateSearchParam("candidateSearchParams", searchParam.getResourceType(), searchParam.getSearchParam());
for (EmpiResourceSearchParamJson searchParams : theEmpiRulesJson.getCandidateSearchParams()) {
searchParams.iterator().forEachRemaining(
searchParam -> validateSearchParam("candidateSearchParams", searchParams.getResourceType(), searchParam));
}
for (EmpiFilterSearchParamJson filter : theEmpiRulesJson.getCandidateFilterSearchParams()) {
validateSearchParam("candidateFilterSearchParams", filter.getResourceType(), filter.getSearchParam());
@ -129,7 +130,7 @@ public class EmpiRuleValidator {
private void validatePatientPath(EmpiFieldMatchJson theFieldMatch) {
try {
myTerser.getDefinition(myPatientClass, "Patient." + theFieldMatch.getResourcePath());
} catch (DataFormatException|ConfigurationException e) {
} catch (DataFormatException | ConfigurationException e) {
throw new ConfigurationException("MatchField " +
theFieldMatch.getName() +
" resourceType " +

View File

@ -23,14 +23,18 @@ package ca.uhn.fhir.empi.rules.json;
import ca.uhn.fhir.model.api.IModelJson;
import com.fasterxml.jackson.annotation.JsonProperty;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
/**
*
*/
public class EmpiResourceSearchParamJson implements IModelJson {
public class EmpiResourceSearchParamJson implements IModelJson, Iterable<String> {
@JsonProperty(value = "resourceType", required = true)
String myResourceType;
@JsonProperty(value = "searchParam", required = true)
String mySearchParam;
@JsonProperty(value = "searchParams", required = true)
List<String> mySearchParams;
public String getResourceType() {
return myResourceType;
@ -41,12 +45,15 @@ public class EmpiResourceSearchParamJson implements IModelJson {
return this;
}
public String getSearchParam() {
return mySearchParam;
public Iterator<String> iterator() {
return mySearchParams.iterator();
}
public EmpiResourceSearchParamJson setSearchParam(String theSearchParam) {
mySearchParam = theSearchParam;
public EmpiResourceSearchParamJson addSearchParam(String theSearchParam) {
if (mySearchParams == null) {
mySearchParams = new ArrayList<>();
}
mySearchParams.add(theSearchParam);
return this;
}
}
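A short usage sketch of the `Iterable<String>` shape introduced above. `SearchParamListSketch` is a stand-in class defined here so the example is self-contained; it is not the real `EmpiResourceSearchParamJson`. It also returns an empty iterator when no search params have been added, which is a defensive choice made only in this sketch: the `iterator()` shown in the diff dereferences `mySearchParams` directly.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in mirroring the fluent add + Iterable design
public class SearchParamListSketch implements Iterable<String> {
	private String resourceType;
	private List<String> searchParams; // stays null until the first add

	public SearchParamListSketch setResourceType(String theResourceType) {
		resourceType = theResourceType;
		return this;
	}

	public String getResourceType() {
		return resourceType;
	}

	public SearchParamListSketch addSearchParam(String theSearchParam) {
		if (searchParams == null) {
			searchParams = new ArrayList<>();
		}
		searchParams.add(theSearchParam);
		return this;
	}

	@Override
	public Iterator<String> iterator() {
		// Defensive (sketch only): avoid a NullPointerException when nothing was added
		if (searchParams == null) {
			return Collections.emptyIterator();
		}
		return searchParams.iterator();
	}

	public static void main(String[] args) {
		SearchParamListSketch blocking = new SearchParamListSketch()
			.setResourceType("Patient")
			.addSearchParam("given")
			.addSearchParam("family");
		// Implementing Iterable allows a plain for-each over the search params
		for (String searchParam : blocking) {
			System.out.println(blocking.getResourceType() + " blocks on " + searchParam);
		}
	}
}
```

Implementing `Iterable<String>` keeps the list itself private while still letting callers such as a validator or a query builder walk the params in insertion order.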

View File

@ -37,10 +37,10 @@ public abstract class BaseEmpiRulesR4Test extends BaseR4Test {
EmpiResourceSearchParamJson patientBirthdayBlocking = new EmpiResourceSearchParamJson()
.setResourceType("Patient")
.setSearchParam(Patient.SP_BIRTHDATE);
.addSearchParam(Patient.SP_BIRTHDATE);
EmpiResourceSearchParamJson patientIdentifierBlocking = new EmpiResourceSearchParamJson()
.setResourceType("Patient")
.setSearchParam(Patient.SP_IDENTIFIER);
.addSearchParam(Patient.SP_IDENTIFIER);
EmpiFieldMatchJson lastNameMatchField = new EmpiFieldMatchJson()

View File

@ -1,7 +1,7 @@
{
"candidateSearchParams" : [{
"resourceType" : "Patient",
"searchParam" : "foo"
"searchParams" : ["foo"]
}],
"candidateFilterSearchParams" : [],
"matchFields" : [],