Data Acquisition Service
Service that acquires data based on reporting requirements
Overview
The Data Acquisition service is responsible for connecting and querying a tenant’s endpoint for FHIR resources that are needed to evaluate patients for a measure. For Epic installations, Link Cloud is utilizing the Epic FHIR STU3 Patient List resource to inform which patients are currently admitted in the facility. While this is the current solution to acquiring the patient census, there are other means of patient acquisition being investigated (ADT V2, Bulk FHIR) to provide universal support across multiple EHR vendors.
Nodes
Common Configurations
- Swagger
- Azure App Configuration
- Kafka Configuration
- Kafka Consumer Retry Configuration
- Service Registry Configuration
- CORS Configuration
- Token Service Configuration
- Service Authentication
- SQL Server Database Configuration
- Data Acquisition Service Configuration
Features and Functionality
Data Acquisition is a crucial step in the report generation pipeline, responsible for obtaining clinical data from external systems, such as FHIR R4 endpoints within electronic health record (EHR) systems. By systematically acquiring and managing data, it ensures that downstream processes like evaluation and reporting are equipped with the necessary information.
Key Roles of Data Acquisition
- Patient List Acquisition:
  - Retrieves a FHIR List of patients from the EHR, if configured.
  - Serves as the initial step for identifying patients relevant to quality measure evaluations.
- Individual Patient Data Acquisition:
  - Acquires detailed FHIR data for individual patients identified in the census.
  - Includes all essential data elements required for initial measure evaluations.
- Supplemental Data Acquisition:
  - Retrieves additional data elements that may not be needed for initial evaluations but are desirable for complete submission and reporting.
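As an illustration of the patient list step, the following is a minimal sketch of pulling patient ids out of a FHIR List resource. It assumes the List has already been fetched and parsed into a dictionary; the function name `patient_ids_from_list` and the sample ids are hypothetical and not part of the service.

```python
from typing import Any


def patient_ids_from_list(fhir_list: dict[str, Any]) -> list[str]:
    """Return the patient ids referenced by a FHIR List, in order, without duplicates."""
    ids: list[str] = []
    for entry in fhir_list.get("entry", []):
        reference = entry.get("item", {}).get("reference", "")
        # References usually look like "Patient/123" (relative) or end in
        # ".../Patient/123" (absolute); anything else is skipped.
        resource_type, _, resource_id = reference.rpartition("/")
        if resource_type.endswith("Patient") and resource_id and resource_id not in ids:
            ids.append(resource_id)
    return ids


# Hypothetical census List with one duplicate entry.
example_list = {
    "resourceType": "List",
    "id": "census-1",
    "entry": [
        {"item": {"reference": "Patient/p1"}},
        {"item": {"reference": "Patient/p2"}},
        {"item": {"reference": "Patient/p1"}},
    ],
}

print(patient_ids_from_list(example_list))  # ['p1', 'p2']
```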
Where Data Acquisition Fits in the Pipeline
Data acquisition plays a role in three distinct stages of the report generation pipeline:
- Patient Identification:
  - Acquiring a list of patients from the EHR to determine the cohort for evaluation.
- Initial Data Collection and Evaluation:
  - Obtaining the primary dataset needed for evaluating quality measures.
  - Determining whether a patient qualifies for reporting based on initial measure criteria.
- Supplemental Data Collection:
  - Acquiring additional, non-essential data to enrich the submission.
  - Completing the dataset for comprehensive evaluation and reporting.
Progressive Querying
To optimize data acquisition, the system employs a technique called Progressive Querying.
During progressive querying, data is acquired in stages to meet the evaluation pipeline’s needs. It flows between services via Kafka topics/events, starting from data acquisition to normalization, through initial evaluation to determine patient relevance, and back to acquire supplemental data, which is then normalized and re-evaluated.
This method minimizes the data retrieved from the EHR by acquiring only what is necessary at each stage of the pipeline:
- Initial Querying: Focuses on essential data needed to evaluate measures and determine patient inclusion.
- Supplemental Querying: Retrieves additional data elements after initial evaluations confirm patient relevance.
- Final Evaluation: Combines initial and supplemental data for comprehensive measure evaluation and reporting.
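The staged flow can be sketched in a few lines. This is a simplified, in-process illustration only: in the real pipeline the stages are separate services communicating over Kafka, and the `fetch` and `is_relevant` callables here are hypothetical stand-ins for acquisition and initial evaluation.

```python
from typing import Callable

Resources = list[dict]


def progressive_acquire(
    patient_id: str,
    fetch: Callable[[str, str], Resources],
    is_relevant: Callable[[Resources], bool],
) -> Resources:
    """Acquire initial data, then supplemental data only for relevant patients."""
    initial = fetch(patient_id, "Initial")
    if not is_relevant(initial):
        # The patient did not qualify, so no supplemental query is issued.
        return initial
    supplemental = fetch(patient_id, "Supplemental")
    return initial + supplemental


# Hypothetical stand-ins that record which queries were issued.
calls: list[tuple[str, str]] = []


def fake_fetch(patient_id: str, phase: str) -> Resources:
    calls.append((patient_id, phase))
    if phase == "Initial":
        return [{"resourceType": "Encounter"}] if patient_id == "p1" else []
    return [{"resourceType": "MedicationRequest"}]


def has_encounter(resources: Resources) -> bool:
    return any(r["resourceType"] == "Encounter" for r in resources)


print(len(progressive_acquire("p1", fake_fetch, has_encounter)))  # 2
print(len(progressive_acquire("p2", fake_fetch, has_encounter)))  # 0
```

Only the relevant patient triggers a second query, which is the saving progressive querying is designed to provide.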
Benefits of Progressive Querying
- Efficiency: Reduces the volume of data retrieved from the EHR, optimizing system performance.
- Precision: Focuses on acquiring only data that is needed for specific stages in the pipeline.
- Scalability: Supports large-scale operations by limiting unnecessary data transfers.
Bulk FHIR in Data Acquisition
Bulk FHIR is a mechanism under exploration for acquiring data efficiently. However, several limitations impact its general use in data acquisition workflows:
- Challenges with Bulk FHIR:
  - Most implementations lack sufficient support for acquiring specific patient data.
  - Filtering returned data is often not robust enough.
  - To align with the goal of acquiring only necessary data, the system does not currently implement Bulk FHIR for general initial or supplemental data acquisition.
- Use Cases for Bulk FHIR:
  - Patient Census Identification:
    - Bulk FHIR is a viable solution for identifying the “census of patients,” analogous to using the FHIR “List” endpoint.
    - It can acquire “Patient” resources for a group of patients associated with a query, filter, or registry in the EHR.
    - This use case is limited to identifying patients of interest and does not address broader data acquisition.
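For the census use case, a Bulk FHIR group export is started with a "kick-off" request. The sketch below only builds that request, following the Bulk Data Access convention of a `$export` operation on a Group with an asynchronous `Prefer` header. The base URL, group id, and function name are hypothetical, and this is not necessarily how the service issues the request.

```python
def build_group_export_request(base_url: str, group_id: str) -> tuple[str, dict[str, str]]:
    """Build the kick-off URL and headers for a Patient-only group export."""
    # _type=Patient limits the export to Patient resources, which is all the
    # census use case needs.
    url = f"{base_url.rstrip('/')}/Group/{group_id}/$export?_type=Patient"
    headers = {
        "Accept": "application/fhir+json",
        "Prefer": "respond-async",  # bulk exports are asynchronous
    }
    return url, headers


url, headers = build_group_export_request("https://ehr.example.org/fhir/", "census-group")
print(url)
```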
Configuration for Data Acquisition
Data acquisition is configurable per tenant, ensuring flexibility to accommodate diverse EHR systems and data requirements. Key configurable parameters include:
- Base FHIR URL:
  - For general data acquisition.
  - For FHIR List (patients of interest) retrieval.
- Authentication Information:
  - Such as client credentials (e.g., “client id”).
- Patient Census Retrieval:
  - FHIR List “id” or Bulk FHIR “Group ID” used for identifying the patient cohort.
- EHR Query Throttling/Limitations:
  - Configurable settings to respect EHR query limitations (e.g., maximum queries per minute).
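To make the throttling parameter concrete, here is a minimal sketch of a sliding-window limiter for a "maximum queries per minute" setting. The class name `QueryThrottle` and its methods are hypothetical; the service's actual throttling mechanism is not described in this document.

```python
import time
from collections import deque
from typing import Callable


class QueryThrottle:
    """Allow at most max_per_minute queries in any rolling 60-second window."""

    def __init__(self, max_per_minute: int, clock: Callable[[], float] = time.monotonic):
        self.max_per_minute = max_per_minute
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self._sent: deque = deque()  # timestamps of recent queries, oldest first

    def seconds_until_allowed(self) -> float:
        """Return 0.0 if a query may be sent now, else the wait time in seconds."""
        now = self.clock()
        # Forget queries that have left the 60-second window.
        while self._sent and now - self._sent[0] >= 60.0:
            self._sent.popleft()
        if len(self._sent) < self.max_per_minute:
            return 0.0
        return 60.0 - (now - self._sent[0])

    def record_query(self) -> None:
        """Record that a query was just sent to the EHR."""
        self._sent.append(self.clock())


throttle = QueryThrottle(max_per_minute=30)
if throttle.seconds_until_allowed() == 0.0:
    throttle.record_query()  # the caller would issue the FHIR query here
```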
Configuring Query Plans
Types of query plans:
- Daily
- Weekly
- Monthly
- Discharge
“Discharge” query plans are used when a patient is discharged from the hospital. This plan is triggered by a discharge event and is used to acquire data for the patient.
All other query plan types are used to acquire data for patients who are currently in the hospital, and are triggered by the end date/time of the scheduled report. The end date/time is interpreted in the tenant’s time zone: if the reporting period ends at 12:59:59 PM, that means 12:59:59 PM in the tenant’s time zone, not UTC.
All times are stored in UTC format. The tenant’s time zone is configured with a valid value from IANA.
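The conversion from a tenant-local period end to the stored UTC value can be sketched as follows, using Python's standard `zoneinfo` module (which relies on an IANA time zone database being available on the host). The function name and the example zone `America/Chicago` are illustrative assumptions.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def period_end_to_utc(local_end: datetime, tenant_time_zone: str) -> datetime:
    """Interpret a naive timestamp in the tenant's IANA zone and convert it to UTC."""
    localized = local_end.replace(tzinfo=ZoneInfo(tenant_time_zone))
    return localized.astimezone(timezone.utc)


# 12:59:59 PM local time in a hypothetical Central Time tenant (UTC-6 in January).
end_utc = period_end_to_utc(datetime(2024, 1, 31, 12, 59, 59), "America/Chicago")
print(end_utc.isoformat())  # 2024-01-31T18:59:59+00:00
```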
Initial / Supplemental Queries
The previously mentioned progressive query phases (initial, supplemental) are configurable through the query plan. Each configured phase can contain a list of FHIR resources that must be acquired from the configured endpoint.
Query Types
For each FHIR resource that must be queried, there are two main query types that are supported:
- Parameter: Parameters that will be appended to the FHIR search for the configured resource.
- Reference: Any FHIR references for the configured resource found in other acquired resources for that phase will be queried for.
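The Parameter query type can be illustrated by turning a list of configured parameters into a FHIR search string. The parameter shape (`ParameterType`, `Name`, `Variable`, `Format`, `Literal`) mirrors the example plan in this document. The meaning of the `Variable` indexes (0 as patient id, 1 as period start, 3 as period end) is an assumption inferred from that example, and the function name is hypothetical.

```python
from urllib.parse import urlencode

# Assumed meaning of the "Variable" indexes; not confirmed by the source.
ASSUMED_VARIABLES = {0: "patient_id", 1: "period_start", 3: "period_end"}


def build_search_query(resource_type: str, parameters: list[dict], values: dict[str, str]) -> str:
    """Build a relative FHIR search string such as 'Encounter?patient=p1&...'."""
    pairs = []
    for param in parameters:
        if param["ParameterType"] == "Literal":
            value = param["Literal"]
        else:
            raw = values[ASSUMED_VARIABLES[param["Variable"]]]
            fmt = param.get("Format")
            # A format such as "ge{0}" wraps the raw value in a search prefix.
            value = fmt.format(raw) if fmt else raw
        pairs.append((param["Name"], value))
    return f"{resource_type}?{urlencode(pairs)}"


encounter_params = [
    {"ParameterType": "Variable", "Name": "patient", "Variable": 0, "Format": None},
    {"ParameterType": "Variable", "Name": "date", "Variable": 1, "Format": "ge{0}"},
    {"ParameterType": "Variable", "Name": "date", "Variable": 3, "Format": "le{0}"},
]
values = {"patient_id": "p1", "period_start": "2024-01-01", "period_end": "2024-01-31"}

print(build_search_query("Encounter", encounter_params, values))
# Encounter?patient=p1&date=ge2024-01-01&date=le2024-01-31
```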
Example Plan
Below is an example of a monthly query plan that’s configured to acquire the following resources:
- Initial Query Phase:
| Resource | Query Type | Description |
|---|---|---|
| Patient | N/A | Patient resources will always be queried for each configured query plan. No configuration is needed. |
| Encounter | Parameter | The following parameters are included in the search: patient id, period start date and period end date |
| Location | Reference | Any Location FHIR references found in other acquired initial resources will be queried for. ‘SearchPost’ will perform an HTTP POST search for Locations rather than a GET. If an OperationType is not added, it defaults to a GET search. See the FHIR search documentation for more information. |
- Supplemental Query Phase:
| Resource | Query Type | Description |
|---|---|---|
| MedicationRequest | Parameter | The following parameters are included in the search: patient id, period start date, period end date, and the literal value ‘order’ in the MedicationRequest.intent element. |
| Medication | Reference | Any Medication FHIR references found in other acquired supplemental resources will be queried for. |
```json
{
  "PlanName": "NHSNdQMAcuteCareHospitalInitialPopulation",
  "FacilityId": "st-marys-hospital",
  "EHRDescription": "",
  "LookBack": "P0D",
  "Type": "Monthly",
  "InitialQueries": {
    "0": {
      "ResourceType": "Encounter",
      "QueryConfigType": "Parameter",
      "Parameters": [
        { "ParameterType": "Variable", "Name": "patient", "Variable": 0, "Format": null },
        { "ParameterType": "Variable", "Name": "date", "Variable": 1, "Format": "ge{0}" },
        { "ParameterType": "Variable", "Name": "date", "Variable": 3, "Format": "le{0}" }
      ]
    },
    "1": {
      "QueryConfigType": "Reference",
      "ResourceType": "Location",
      "OperationType": "SearchPost",
      "Paged": 100
    }
  },
  "SupplementalQueries": {
    "0": {
      "QueryConfigType": "Parameter",
      "ResourceType": "MedicationRequest",
      "Parameters": [
        { "ParameterType": "Variable", "Name": "patient", "Variable": 0, "Format": null },
        { "ParameterType": "Variable", "Name": "authoredon", "Variable": 1, "Format": "ge{0}" },
        { "ParameterType": "Variable", "Name": "authoredon", "Variable": 3, "Format": "le{0}" },
        { "ParameterType": "Literal", "Name": "intent", "Literal": "order" }
      ]
    },
    "1": {
      "QueryConfigType": "Reference",
      "ResourceType": "Medication",
      "OperationType": "SearchPost",
      "Paged": 100
    }
  }
}
```
Configuring Census and Data Sources
Data sources (where the FHIR server is located and how to authenticate) are configured via “Query Configs”. There is currently no association between a query plan and data source. Whenever data acquisition attempts to execute a query plan against a data source, it uses the FHIR server and authentication method specified by the “Query Config”, for the specified facility/tenant.
TODO: Add details about how to authenticate against Epic, Cerner, Basic, and/or OAuth data sources.
Query Plans and Acquisition Logs
The Data Acquisition Service uses Query Plans to define the strategy for retrieving data. These plans are translated into Data Acquisition Logs, which represent discrete units of work to be executed by the worker service.
Acquisition Log Lifecycle
Each acquisition task follows a strictly managed state machine:
- Scheduled: The log entry is created but not yet ready for execution. This usually happens when a DataAcquisitionRequested event is received.
- Ready: The system has determined that the log is eligible for execution. A Ready to Acquire event (v0.6.0), which indicates when a data acquisition log is ready to be processed, is produced for the worker.
- Queued: The worker has received the Ready to Acquire event and successfully “claimed” the log.
- InProgress: The worker is actively querying the FHIR endpoint and processing resources.
- Completed: All resources for the log have been successfully acquired and normalized.
- Failed: An error occurred during acquisition that exceeded the maximum retry attempts.
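The lifecycle can be modelled as a small state machine. The status names come from the list above. The allowed transitions are an assumption based on the order in which the states are described, and the `advance` helper is hypothetical.

```python
from enum import Enum


class LogStatus(Enum):
    SCHEDULED = "Scheduled"
    READY = "Ready"
    QUEUED = "Queued"
    IN_PROGRESS = "InProgress"
    COMPLETED = "Completed"
    FAILED = "Failed"


# Assumed transitions; retries (returning a log to an earlier state) are left out.
ASSUMED_TRANSITIONS = {
    LogStatus.SCHEDULED: {LogStatus.READY},
    LogStatus.READY: {LogStatus.QUEUED},
    LogStatus.QUEUED: {LogStatus.IN_PROGRESS},
    LogStatus.IN_PROGRESS: {LogStatus.COMPLETED, LogStatus.FAILED},
    LogStatus.COMPLETED: set(),
    LogStatus.FAILED: set(),
}


def advance(current: LogStatus, target: LogStatus) -> LogStatus:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if target not in ASSUMED_TRANSITIONS[current]:
        raise ValueError(f"Cannot move from {current.value} to {target.value}")
    return target


status = LogStatus.SCHEDULED
for step in (LogStatus.READY, LogStatus.QUEUED, LogStatus.IN_PROGRESS, LogStatus.COMPLETED):
    status = advance(status, step)
print(status.value)  # Completed
```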
Log Creation Process
When a DataAcquisitionRequested event is received, the service:
- Retrieves the appropriate Query Plan for the facility and report type.
- Identifies the target patients (either from the event itself or by creating a Census log).
- Generates a set of DataAcquisitionLog entries for each required resource type and patient, assigned to the Initial phase.
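The last step can be sketched as expanding a query plan into one log entry per patient and resource type. The field names on the `DataAcquisitionLog` class below are hypothetical, as the real schema is not shown in this document. The plan shape follows the example query plan.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class DataAcquisitionLog:
    # Hypothetical fields; the service's actual entity may differ.
    facility_id: str
    patient_id: str
    resource_type: str
    query_phase: str = "Initial"
    status: str = "Scheduled"


def create_initial_logs(query_plan: dict, patient_ids: list[str]) -> list[DataAcquisitionLog]:
    """Create one Initial-phase log per (patient, resource type) pair."""
    # Keys of "InitialQueries" are numeric strings ("0", "1", ...); sort numerically.
    ordered = sorted(query_plan["InitialQueries"].items(), key=lambda kv: int(kv[0]))
    resource_types = [query["ResourceType"] for _, query in ordered]
    return [
        DataAcquisitionLog(query_plan["FacilityId"], patient_id, resource_type)
        for patient_id, resource_type in product(patient_ids, resource_types)
    ]


plan = {
    "FacilityId": "st-marys-hospital",
    "InitialQueries": {
        "0": {"ResourceType": "Encounter", "QueryConfigType": "Parameter"},
        "1": {"ResourceType": "Location", "QueryConfigType": "Reference"},
    },
}
logs = create_initial_logs(plan, ["p1", "p2"])
print(len(logs))  # 4
```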
Execution and Dependency Management
The system determines when a log should be executed based on its Status and QueryPhase:
- Phase-Based Execution: Logs are typically executed in phases. Initial phase logs are created first. Once initial data is acquired and evaluated by downstream services, they may trigger Supplemental acquisition by sending new DataAcquisitionRequested events with a supplemental phase flag.
- Wait Logic: Supplemental logs are not created until the initial evaluation confirms the patient’s relevance to the report. This prevents unnecessary data retrieval for patients who do not meet the report’s criteria.
- Retry Mechanism: If a log fails due to transient issues (e.g., network timeout), its retry count is incremented and it is returned to a Pending or Scheduled state for retry, up to a maximum of 5 attempts.
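The retry rule can be sketched as a small decision function. Two assumptions are made here: the failed attempt itself counts toward the limit of 5, and a retried log goes back to Scheduled. The function and constant names are hypothetical.

```python
MAX_ATTEMPTS = 5  # maximum attempts, as stated above


def status_after_failure(previous_attempts: int) -> tuple[int, str]:
    """Return (attempts so far, next status) after a transient failure."""
    attempts = previous_attempts + 1
    if attempts >= MAX_ATTEMPTS:
        # Retry budget exhausted: the log is marked as Failed.
        return attempts, "Failed"
    # Otherwise the log is rescheduled for another attempt (assumed state).
    return attempts, "Scheduled"


print(status_after_failure(0))  # (1, 'Scheduled')
print(status_after_failure(4))  # (5, 'Failed')
```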
Query Plan Structure
A Query Plan relates to logs by defining the “blueprint” for their creation:
- ResourceType: The FHIR resource to be queried (e.g., Patient, Encounter, Observation).
- QueryConfigType:
  - Parameter: Appends specific FHIR search parameters (e.g., date=ge2024-01-01).
  - Reference: Instructs the system to find references to this resource type within other already acquired resources.
- QueryPhase: Categorizes the query into Initial or Supplemental.
For more details on how these logs are processed, see the Data Acquisition Worker Service.
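To illustrate the Reference query type, the sketch below walks already acquired resources and collects references to a given resource type. It assumes resources are plain dictionaries in FHIR JSON form. The function name is hypothetical and the worker's real traversal logic may differ.

```python
from typing import Any


def find_references(resources: list[dict], resource_type: str) -> list[str]:
    """Collect unique references such as 'Location/loc-1' from acquired resources."""
    found: list[str] = []

    def walk(node: Any) -> None:
        if isinstance(node, dict):
            reference = node.get("reference")
            if (
                isinstance(reference, str)
                and reference.startswith(f"{resource_type}/")
                and reference not in found
            ):
                found.append(reference)
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    for resource in resources:
        walk(resource)
    return found


encounters = [
    {
        "resourceType": "Encounter",
        "subject": {"reference": "Patient/p1"},
        "location": [
            {"location": {"reference": "Location/loc-1"}},
            {"location": {"reference": "Location/loc-2"}},
        ],
    },
    {"resourceType": "Encounter", "location": [{"location": {"reference": "Location/loc-1"}}]},
]
print(find_references(encounters, "Location"))  # ['Location/loc-1', 'Location/loc-2']
```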
Tail Messages
Data Acquisition emits a tail Resource Acquired event for each (patientId, reportTrackingId, queryType) tuple when all resources for the phase are produced. This tail sets acquisitionComplete = true and signals downstream services to proceed. See:
- Events → Resource Acquired (v0.6.0): represents a single FHIR resource retrieved from a facility.
- Docs → Tail Messages: Patient Completion Signals
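A rough sketch of the tail pattern follows: one message per acquired resource, followed by a final message with acquisitionComplete set to true. The key names `patientId`, `reportTrackingId`, `queryType`, and `acquisitionComplete` come from the text above. The overall message shape, including whether the tail carries a resource, is an assumption, and the function name is hypothetical.

```python
def build_resource_acquired_messages(
    patient_id: str,
    report_tracking_id: str,
    query_type: str,
    resources: list[dict],
) -> list[dict]:
    """Return one message per resource plus a trailing completion (tail) message."""
    key = {
        "patientId": patient_id,
        "reportTrackingId": report_tracking_id,
        "queryType": query_type,
    }
    messages = [{**key, "resource": r, "acquisitionComplete": False} for r in resources]
    # Tail message: assumed here to carry no resource, only the completion flag.
    messages.append({**key, "resource": None, "acquisitionComplete": True})
    return messages


messages = build_resource_acquired_messages(
    "p1", "report-1", "Initial", [{"resourceType": "Encounter"}, {"resourceType": "Location"}]
)
print(len(messages))  # 3
```

A downstream consumer in this sketch would wait for the message with acquisitionComplete set to true before treating the (patientId, reportTrackingId, queryType) tuple as complete.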
Logging
The Logging configuration defines the logging levels for different parts of the application.
| Property | Description | Required | Default Value | Secret? |
|---|---|---|---|---|
| Logging__LogLevel__Default | Default log level | No | | No |
| Logging__LogLevel__Microsoft.AspNetCore | Log level for ASP.NET Core logs | No | | No |
| Logging__LogLevel__System | Log level for system logs | No | | No |
Database Schema
Persistence schema for the Data Acquisition Service (SQL Server)