Validation Service

Service that provides functionality for validating clinical data in the system.

Service | Java | MSSQL | 8075 | CPU: 500m-3 | RAM: 3Gi-4Gi | OpenAPI

Overview

The Validation Service is a Java-based application that validates FHIR resources against the FHIR specification and any additional constraints defined by the Link Cloud tenant. It uses the HAPI FHIR library to perform the validation.

Nodes

Common Configurations

Environment Variables

| Env Variable | Description | Type/Value | Secret? |
|---|---|---|---|
| JAVA_TOOL_OPTIONS | Specify min/max Java heap size | -Xms1024m -Xmx2048m | No |

Custom Configurations

| Property Name | Description | Type/Value | Secret? |
|---|---|---|---|
| artifact.init | Whether or not to initialize the artifacts in the database with default artifacts | true (default) or false | No |
| link.fhir-terminology-service-url | URL of an external standards-compliant FHIR Terminology server used directly for terminology calls. (First priority) | URL | No |
| link.terminology-service-url | Base URL of the Link Terminology Service; service appends /api/terminology/fhir automatically. (Second priority) | URL | No |
| cache.type | Cache implementation used for validate-code calls | none, memory, redis | No |
| cache.validate-code.ttl | TTL for validate-code cache entries (in seconds) | integer (seconds) | No |
| spring.data.redis.host | Redis host (used when cache.type=redis) | string | No |
| spring.data.redis.port | Redis port (used when cache.type=redis) | integer | No |
| spring.data.redis.password | Redis password (used when cache.type=redis) | string | Yes |
| spring.data.redis.username | Redis username (optional; used when cache.type=redis) | string | No |
| spring.data.redis.database | Redis database index (used when cache.type=redis) | integer | No |

Features and Functionality

New Installation Notes

After a new installation of the Validation Service, run the following endpoints:

  • /api/artifact/$initialize: initializes the database with the artifacts that are embedded in the Validation Service.
  • /api/category/$initialize: initializes the database with the default categories stored in /src/main/resources/categories.json (within the code base).

PENDING: This behavior will change so that artifacts and categories are initialized automatically on service startup when none exist in the service’s database.

Upload & Storage

Artifacts are uploaded either individually or as an NPM package (preferred).

  • Individually: PUT /api/validation/artifact/:type/:name
    • type = “RESOURCE”
    • name = FHIR resource id
  • As a package: PUT /api/validation/artifact/:type/:name
    • type = “PACKAGE”
    • name = NPM package id
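
As a rough illustration of the two upload forms above, the following sketch builds (but does not send) the corresponding PUT requests with the JDK HTTP client. The base URL, the artifact names, and the Content-Type values are assumptions made for the example; the source does not specify the expected request body or headers.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class ArtifactUploadSketch {
    // Assumption: the service is reachable on localhost at port 8075.
    static final String BASE_URL = "http://localhost:8075";

    /** Builds (but does not send) a PUT request for /api/validation/artifact/:type/:name. */
    static HttpRequest buildUpload(String type, String name, byte[] body, String contentType) {
        URI uri = URI.create(BASE_URL + "/api/validation/artifact/" + type + "/" + name);
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", contentType) // assumed value, not confirmed by the service docs
                .PUT(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical artifact names and payloads, for illustration only.
        byte[] resource = "{\"resourceType\":\"ValueSet\",\"id\":\"example-valueset\"}"
                .getBytes(StandardCharsets.UTF_8);
        HttpRequest single = buildUpload("RESOURCE", "example-valueset", resource, "application/fhir+json");
        HttpRequest pkg = buildUpload("PACKAGE", "example.package", new byte[0], "application/gzip");
        System.out.println(single.method() + " " + single.uri());
        System.out.println(pkg.method() + " " + pkg.uri());
    }
}
```

Sending the request would be a matter of passing it to `HttpClient.send`; names containing characters that are not URL-safe would need encoding first.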

Note: The Admin UI currently only supports uploading a FHIR Bundle of resources. As a result, the Admin UI can only upload the individual resources from that Bundle to the Validation Service. We may consider using NPM packages in the Admin UI and aligning the MeasureEval service and the Validation Service to use just NPM packages.

Process Flow

Configuration

The Validation Service supports two types of artifacts that define validation rules:

  1. Package (package.tgz format)
    • A packaged collection of FHIR artifacts (profiles, ValueSets, CodeSystems).
  2. FHIR Resource Artifacts
    • Individual StructureDefinitions, ValueSets, and CodeSystems can be provided.

In addition to artifacts, categories must be initialized/specified in order to have categorized results. Otherwise, all validation results end up being “uncategorized”.

Validation Categories

Validation categories are used to group similar validation issues together, providing a way to manage and prioritize them. Each category defines a set of rules (matchers) that determine which validation issues belong to it.

Category Properties

Each category consists of the following properties:

  • ID: A unique identifier for the category (e.g., Incorrect_display_value_for_code).
  • Title: A human-readable name for the category.
  • Severity: The severity level assigned to issues in this category (ERROR, WARNING, or INFORMATION).
  • Acceptable: A boolean flag indicating if the issues in this category are considered acceptable.
    • Note: In the future, this flag will be used to determine if a report should be submitted (e.g., if all issues are marked as acceptable, the report can still be submitted).
  • Guidance: Instructions or information on how to resolve the issues in this category.
  • Matcher: A rule or set of rules used to match validation issues.

Rule Matching Fields

Rules can be keyed on the following fields of a validation result (OperationOutcome.issue):

  • MESSAGE: Matches against the human-readable description of the issue.
  • SEVERITY: Matches against the FHIR severity of the issue (e.g., error, warning, information).
  • CODE: Matches against the FHIR issue type code (e.g., code-invalid, value).
  • EXPRESSION: Matches against the FHIRPath expression pointing to the element in the resource that caused the issue.

Matcher Types

Validation matchers are used to define the rules for a category. All matchers support an inverted property, which allows for logical negation of the match result.

  • RegexMatcher: The primary matcher used for field-level matching. It evaluates a regular expression against a specific field of a validation issue.
  • InvertibleMatcher: This is the base abstract implementation for other matchers. While not used directly, it provides the inverted property to its subclasses (RegexMatcher and CompositeMatcher), allowing any rule to be negated.
  • CompositeMatcher: A powerful matcher that allows grouping multiple other matchers (including other composite matchers) to create complex matching logic. It uses a requiresAllChildren flag to determine if it should act as a logical AND (true) or logical OR (false).

Detailed Matcher Examples

Below are more detailed examples showing the differences and use cases for each matcher implementation.

1. RegexMatcher

The RegexMatcher is the most straightforward way to match a validation issue. It targets a specific field of the OperationOutcome.issue.

Example: Matching a specific code

{
  "field": "CODE",
  "regex": "^code-invalid$"
}

2. InvertibleMatcher (Negation)

Since RegexMatcher and CompositeMatcher both extend InvertibleMatcher, they can both be negated using the inverted flag.

Example: Negative Match (Anything except a specific message)

This will match any validation issue except those where the message starts with “Success”.

{
  "field": "MESSAGE",
  "regex": "^Success",
  "inverted": true
}

3. CompositeMatcher (Logical Grouping)

The CompositeMatcher is used to combine multiple rules. It can be used as a logical AND or a logical OR.

Example: Logical AND (Matching a specific code AND a specific severity)

Matches only if the issue has a code of value AND a severity of error.

{
  "requiresAllChildren": true,
  "children": [
    {
      "field": "CODE",
      "regex": "^value$"
    },
    {
      "field": "SEVERITY",
      "regex": "^error$"
    }
  ]
}

Example: Logical OR (Matching any of multiple fields)

Matches if either the message OR the expression contains “Patient”.

{
  "requiresAllChildren": false,
  "children": [
    {
      "field": "MESSAGE",
      "regex": "Patient"
    },
    {
      "field": "EXPRESSION",
      "regex": "Patient"
    }
  ]
}

Example: Complex Nested Logic

Matches if the severity is error AND the issue does NOT have a code of informational.

{
  "requiresAllChildren": true,
  "children": [
    {
      "field": "SEVERITY",
      "regex": "^error$"
    },
    {
      "field": "CODE",
      "regex": "^informational$",
      "inverted": true
    }
  ]
}
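
To make the matcher semantics above concrete, here is a minimal sketch of how regex, inverted, and composite matching could be evaluated. It is not the service's actual implementation: the `IssueMatcher` interface, the `Field` enum, and the map-based issue representation are hypothetical, and the sketch assumes the regular expression is applied with "find" semantics (a match anywhere in the field), which is consistent with the "contains Patient" example but not confirmed by the source.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

final class MatcherSketch {
    enum Field { MESSAGE, SEVERITY, CODE, EXPRESSION }

    /** Hypothetical contract: an issue is modeled as a map from field to its string value. */
    interface IssueMatcher {
        boolean matches(Map<Field, String> issue);
    }

    /** Regex match on one field; "inverted" negates the result. */
    static IssueMatcher regex(Field field, String regex, boolean inverted) {
        Pattern pattern = Pattern.compile(regex);
        // Assumption: find() semantics; a missing field is treated as an empty string.
        return issue -> pattern.matcher(issue.getOrDefault(field, "")).find() != inverted;
    }

    /** Logical AND (requiresAllChildren = true) or OR (false) over child matchers. */
    static IssueMatcher composite(boolean requiresAllChildren, boolean inverted, List<IssueMatcher> children) {
        return issue -> {
            boolean result = requiresAllChildren
                    ? children.stream().allMatch(child -> child.matches(issue))
                    : children.stream().anyMatch(child -> child.matches(issue));
            return result != inverted;
        };
    }

    public static void main(String[] args) {
        // Mirrors the "Complex Nested Logic" example: severity error AND NOT code informational.
        IssueMatcher errorButNotInformational = composite(true, false, List.of(
                regex(Field.SEVERITY, "^error$", false),
                regex(Field.CODE, "^informational$", true)));
        Map<Field, String> issue = Map.of(
                Field.SEVERITY, "error",
                Field.CODE, "code-invalid",
                Field.MESSAGE, "Unknown code",
                Field.EXPRESSION, "Observation.code");
        System.out.println(errorButNotInformational.matches(issue)); // prints true
    }
}
```

Because negation is applied after the child results are combined, an inverted composite with `requiresAllChildren: true` behaves as NOT (A AND B) in this sketch.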

Example Categories

Example 1: Incorrect Display Value (Acceptable)

This category matches issues where a code’s display name is incorrect. It is marked as acceptable: true.

{
  "id": "Incorrect_display_value_for_code",
  "title": "Incorrect display value for code",
  "severity": "WARNING",
  "acceptable": true,
  "guidance": "The display name for the code does not match the expected value in the terminology server. This is usually a minor issue and does not affect the validity of the data itself.",
  "matcher": {
    "field": "MESSAGE",
    "regex": "^Wrong Display Name '.*' for .* should be .*'.*' .*"
  }
}

Example 2: Unable to Match Profile (Not Acceptable)

This category matches issues where a resource’s profile cannot be found. It is marked as acceptable: false.

{
  "id": "Unable_to_match_profile",
  "title": "Unable to match profile",
  "severity": "ERROR",
  "acceptable": false,
  "guidance": "The resource asserts a profile that is not available in the validation service. Please ensure all required profiles are uploaded as artifacts.",
  "matcher": {
    "field": "MESSAGE",
    "regex": "^Unable to find a match for profile .* among choices:"
  }
}

The system reserves a category with an ID of “uncategorized” for any validation issues within a report that are not mapped or associated with a defined category. This ensures that all validation issues are always captured, even when they don’t match any of the configured category rules.
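
The fallback to “uncategorized” can be sketched as follows. This is an illustrative model only: it matches on the issue message alone, reuses the two example category regexes from this section, and assumes that an issue may be assigned to every category whose matcher accepts it, which the source does not state. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Pattern;

final class CategorizerSketch {
    static final String UNCATEGORIZED = "uncategorized";

    /** The two example categories from this section, keyed by category ID. */
    static Map<String, Predicate<String>> exampleCategories() {
        Map<String, Predicate<String>> categories = new LinkedHashMap<>();
        categories.put("Incorrect_display_value_for_code",
                Pattern.compile("^Wrong Display Name '.*' for .* should be .*'.*' .*").asPredicate());
        categories.put("Unable_to_match_profile",
                Pattern.compile("^Unable to find a match for profile .* among choices:").asPredicate());
        return categories;
    }

    /** Returns the IDs of all matching categories, or "uncategorized" when none match. */
    static List<String> categorize(String message, Map<String, Predicate<String>> categories) {
        List<String> matched = new ArrayList<>();
        for (Map.Entry<String, Predicate<String>> category : categories.entrySet()) {
            if (category.getValue().test(message)) {
                matched.add(category.getKey());
            }
        }
        if (matched.isEmpty()) {
            matched.add(UNCATEGORIZED); // reserved ID for issues that match no category
        }
        return matched;
    }

    public static void main(String[] args) {
        Map<String, Predicate<String>> categories = exampleCategories();
        // Hypothetical issue messages, for illustration only.
        System.out.println(categorize(
                "Unable to find a match for profile http://example.org/profile among choices: http://example.org/other",
                categories));
        System.out.println(categorize("Some other validation message", categories));
    }
}
```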

Categories can be configured individually or in bulk via the API.

Example: Profile Validation

When the validation service encounters an Observation resource like the following:

{
  "resourceType": "Observation",
  "meta": {
    "profile": ["http://.../us-core-observation"]
  }
  // ... other properties
}

It will:

  • Validate the resource against the core FHIR specification.
  • Validate against the http://…/us-core-observation profile.
  • If the profile is missing, generate a warning.

Similarly, for properties bound to a ValueSet or CodeSystem, the service expects these artifacts to be provided. If they are missing, it will issue warnings.

Sequence Diagram

The following diagram illustrates the relationship between the Measure Evaluation Service, Kafka, and the Validation Service:

[Diagram: Measure Evaluation Service, Kafka, and Validation Service]

Terminology Service Integration and Caching

The Validation Service validates coded elements using either a remote terminology capability or local in-memory supports, depending on how the service is configured.

Terminology resolution options (in order of precedence):

  • External FHIR Terminology server: If configured, the validator directs terminology operations (e.g., $validate-code) to the external server.
  • Link Terminology Service: If enabled, the validator uses Link’s built-in Terminology Service (FHIR endpoint) for terminology operations.
  • Local supports only: If no remote option is configured, the validator falls back to built-in/common supports (e.g., common code systems and in-memory validation support). In this mode, validations that require external ValueSets/CodeSystems may produce warnings (e.g., “Can’t find value set XXX”).
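
The precedence described above can be sketched as a small resolution function. This is a simplified illustration under stated assumptions: the two parameters stand for the values of link.fhir-terminology-service-url and link.terminology-service-url, the class and method names are hypothetical, and trimming a trailing slash before appending /api/terminology/fhir is a choice made for the example, not documented behavior.

```java
import java.util.Optional;

final class TerminologyEndpointSketch {
    /**
     * Resolves the terminology endpoint in order of precedence:
     * 1) external FHIR terminology server, 2) Link Terminology Service, 3) none (local supports only).
     */
    static Optional<String> resolve(String fhirTerminologyServiceUrl, String terminologyServiceUrl) {
        if (isSet(fhirTerminologyServiceUrl)) {
            return Optional.of(fhirTerminologyServiceUrl.trim());
        }
        if (isSet(terminologyServiceUrl)) {
            // Assumption: strip trailing slashes so the appended path has exactly one separator.
            String base = terminologyServiceUrl.trim().replaceAll("/+$", "");
            return Optional.of(base + "/api/terminology/fhir");
        }
        return Optional.empty(); // no remote calls; only built-in/common supports are used
    }

    private static boolean isSet(String value) {
        return value != null && !value.isBlank();
    }

    public static void main(String[] args) {
        // Hypothetical URLs, for illustration only.
        System.out.println(resolve("https://tx.example.org/fhir", "http://link-terminology:8080"));
        System.out.println(resolve(null, "http://link-terminology:8080/"));
        System.out.println(resolve(null, null));
    }
}
```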

Caching behavior:

  • Remote validate-code calls are wrapped with a cache to reduce repeated network requests.
  • The cache implementation is selectable (e.g., none, in-memory, or distributed) and a TTL can be set to control how long results are retained.
  • The cache key is derived from the tuple (codeSystem, code, display, valueSetUrl), ensuring semantically distinct requests are cached separately.
  • A distributed cache option supports shared caching across instances (e.g., via Redis), while an in-memory option keeps cache local to the process.
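
The caching behavior above can be illustrated with a minimal in-memory TTL cache keyed on the tuple (codeSystem, code, display, valueSetUrl). This is a sketch, not the service's implementation: the class name, the boolean result type, and the injectable clock are assumptions made to keep the example short and testable, and a real validate-code result would carry more than a boolean.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

final class ValidateCodeCacheSketch {
    /** Cache key: requests that differ in any component are cached separately. */
    record Key(String codeSystem, String code, String display, String valueSetUrl) {}

    private record Entry(boolean valid, long expiresAtMillis) {}

    private final Map<Key, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clockMillis;

    ValidateCodeCacheSketch(long ttlSeconds, LongSupplier clockMillis) {
        this.ttlMillis = ttlSeconds * 1000L;
        this.clockMillis = clockMillis;
    }

    /** Returns the cached result if it has not expired; otherwise invokes the remote call and caches it. */
    boolean validate(Key key, Predicate<Key> remoteCall) {
        long now = clockMillis.getAsLong();
        Entry cached = entries.get(key);
        if (cached != null && cached.expiresAtMillis() > now) {
            return cached.valid();
        }
        boolean valid = remoteCall.test(key); // stands in for the remote $validate-code request
        entries.put(key, new Entry(valid, now + ttlMillis));
        return valid;
    }

    public static void main(String[] args) {
        ValidateCodeCacheSketch cache = new ValidateCodeCacheSketch(300, System::currentTimeMillis);
        // Hypothetical code and value set, for illustration only.
        Key key = new Key("http://loinc.org", "1234-5", null, "http://example.org/ValueSet/example");
        Predicate<Key> fakeRemote = k -> {
            System.out.println("remote call for " + k.code());
            return true;
        };
        cache.validate(key, fakeRemote); // triggers the remote call
        cache.validate(key, fakeRemote); // served from the cache within the TTL
    }
}
```

A Redis-backed variant would keep the same key and TTL but store entries outside the process, so that multiple instances share results.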

Configuration reference:

  • For exact property names, possible values, Redis connection options, and sample YAML, see the Custom Configurations section above.
  • If using the Link Terminology Service, the service automatically targets its FHIR endpoint.
  • If no terminology endpoint is configured, the validator will not make network calls for terminology; only built-in/common supports are used.

Known Deficiencies

  • Global scope: All stored artifacts are always loaded; there is no filtering or scoping based on tenant or package.
  • No tenant-specific configuration: There is no ability to configure validation behavior per tenant or per package version.
  • Limited artifact types: The HAPI FHIR validation libraries only support CodeSystem, ValueSet, and StructureDefinition resources.

Future Considerations

  • Operation to bulk retrieve categories and their rules so that they can be edited in a text editor and then passed back to the bulk save operation.
  • Operation to validate and categorize a resource (or Bundle) and return a composite response of the validation results and associated categories.
  • Operation to re-validate and re-categorize a given report, to update the persisted set of results and categories for the report.

Database Schema

4 properties

Persistence schema for the Validation Service (Azure Blob Storage)

ValidationResults (array[object]): Storage for validation outcome reports