FHIRflat specification
The FHIRflat structure closely follows that of FHIR, and simply flattens nested
columns in a manner similar to pd.json_normalize(). Some fields are excluded
either because they are simply used for convenience within a FHIR server,
because they contain information not relevant within ISARIC clinical data, or
because they would contain personally identifiable information (PII). These
fields can be accessed and edited for each resource using the flat_exclusions
property. There are a few specifics to FHIRflat that differ from simply
normalising a FHIR structure, noted below.
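The basic flattening step can be sketched in plain Python. This is a minimal illustration only, not the FHIRflat implementation: the helper name flatten, the excluded argument and the sample resource are all assumptions made for this example, and the real exclusion lists are those exposed through flat_exclusions.

```python
def flatten(obj: dict, prefix: str = "", excluded: frozenset = frozenset()) -> dict:
    """Flatten nested dicts into dot-separated keys, skipping excluded paths.

    Hypothetical helper for illustration; it only mimics the general idea
    of pd.json_normalize() for nested dictionaries.
    """
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if path in excluded:
            continue  # dropped, e.g. server-only metadata or possible PII
        if isinstance(value, dict):
            flat.update(flatten(value, path, excluded))
        else:
            flat[path] = value
    return flat


# Made-up resource; "meta" stands in for a server-convenience field.
resource = {
    "resourceType": "Observation",
    "meta": {"versionId": "1"},
    "valueQuantity": {"value": 72.5, "unit": "kg"},
}
row = flatten(resource, excluded=frozenset({"meta"}))
print(row)
```

Each nested key becomes one column name, and anything listed in excluded never reaches the flat row.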
codeableConcepts
CodeableConcepts are converted into two lists, one of codes and one of the corresponding text. Each coding is compressed into a single string with the format system|code. The '|' symbol was chosen because it is the standard way to query codes in FHIR servers (example). Thus a JSON snippet containing a codeableConcept:

    "code": {
        "coding": [
            {
                "system": "http://loinc.org",
                "code": "3141-9",
                "display": "Body weight Measured"
            },
            {
                "system": "http://snomed.info/sct",
                "code": "27113001",
                "display": "Body weight"
            }
        ]
    }

is coded as two fields:

    code.code    ["http://loinc.org|3141-9", "http://snomed.info/sct|27113001"]
    code.text    ["Body weight Measured", "Body weight"]

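The conversion can be sketched as follows. This is an illustrative sketch under stated assumptions, not the library's own code: the function name flatten_codeable_concept is hypothetical, and the input assumes the standard FHIR shape in which coding is a flat list of Coding objects.

```python
def flatten_codeable_concept(field: str, concept: dict) -> dict:
    """Turn a codeableConcept into parallel lists of codes and display text.

    Hypothetical helper: each coding becomes "system|code", and the
    "coding" level itself does not appear in the column names.
    """
    codings = concept.get("coding", [])
    return {
        f"{field}.code": [f"{c['system']}|{c['code']}" for c in codings],
        f"{field}.text": [c.get("display") for c in codings],
    }


weight = {
    "coding": [
        {"system": "http://loinc.org", "code": "3141-9",
         "display": "Body weight Measured"},
        {"system": "http://snomed.info/sct", "code": "27113001",
         "display": "Body weight"},
    ]
}
flat = flatten_codeable_concept("code", weight)
print(flat["code.code"])
print(flat["code.text"])
```

Keeping the two lists in the same order means the n-th text entry always describes the n-th code.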
Note that the external coding label is removed.

References
A reference is stored as a single string: the name of the resource and its ID, separated by a forward slash.

    "subject": {
        "reference": "Patient/f001",
        "display": "Donald Duck"
    }

becomes

    subject.reference    "Patient/f001"

The display text will not be converted due to the risk of revealing identifying information (e.g., a patient’s name).
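A short sketch of this rule, assuming a hypothetical helper named flatten_reference (not part of any published API): only the reference string is carried over, and display is deliberately left behind.

```python
def flatten_reference(field: str, ref: dict) -> dict:
    """Keep only the "ResourceName/id" string of a FHIR reference.

    Hypothetical helper: the "display" text is dropped because it may
    contain identifying information such as a patient's name.
    """
    return {f"{field}.reference": ref["reference"]}


subject = {"reference": "Patient/f001", "display": "Donald Duck"}
flat = flatten_reference("subject", subject)

# The resource name and ID can be recovered by splitting on the slash.
resource_name, resource_id = flat["subject.reference"].split("/", 1)
print(flat, resource_name, resource_id)
```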
Extensions
The base FHIR schema can be extended to meet the needs of individual implementations using extension fields. FHIRflat displays these with the extension url as part of the column name. For example

    "extension": [
        {
            "url": "timingPhase",
            "valueCodeableConcept": {
                "coding": [
                    {
                        "system": "http://snomed.info/sct",
                        "code": "278307001",
                        "display": "on admission"
                    }
                ]
            }
        },
        {
            "url": "relativePeriod",
            "extension": [
                {"url": "relativeStart", "valueInteger": 2},
                {"url": "relativeEnd", "valueInteger": 5}
            ]
        }
    ]

becomes

    extension.timingPhase.code               "http://snomed.info/sct|278307001"
    extension.timingPhase.text               "on admission"
    extension.relativePeriod.relativeStart   2
    extension.relativePeriod.relativeEnd     5

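The naming scheme for extensions can be sketched with a small recursive function. The names flatten_extensions and _unwrap are hypothetical, and unwrapping a single coding to a scalar is an assumption made here so the output matches the single-valued cells shown above.

```python
def _unwrap(values: list):
    """Return the sole element of a one-item list, else the list itself."""
    return values[0] if len(values) == 1 else values


def flatten_extensions(extensions: list, prefix: str = "extension") -> dict:
    """Build column names from extension urls (illustrative sketch only)."""
    flat = {}
    for ext in extensions:
        path = f"{prefix}.{ext['url']}"
        if "extension" in ext:
            # Complex extension: recurse; the inner "extension" label is omitted.
            flat.update(flatten_extensions(ext["extension"], path))
        elif "valueCodeableConcept" in ext:
            codings = ext["valueCodeableConcept"].get("coding", [])
            flat[f"{path}.code"] = _unwrap(
                [f"{c['system']}|{c['code']}" for c in codings]
            )
            flat[f"{path}.text"] = _unwrap([c.get("display") for c in codings])
        else:
            # Any other value[x] entry, e.g. valueInteger.
            for key, value in ext.items():
                if key.startswith("value"):
                    flat[path] = value
    return flat


extensions = [
    {
        "url": "timingPhase",
        "valueCodeableConcept": {
            "coding": [
                {"system": "http://snomed.info/sct", "code": "278307001",
                 "display": "on admission"}
            ]
        },
    },
    {
        "url": "relativePeriod",
        "extension": [
            {"url": "relativeStart", "valueInteger": 2},
            {"url": "relativeEnd", "valueInteger": 5},
        ],
    },
]
columns = flatten_extensions(extensions)
print(columns)
```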
Complex (nested) extensions such as relativePeriod also omit the internal extension labels.

0..* cardinality fields
Fields which can contain an unspecified number of repeated entries are handled according to the number of entries present. Lists of length 1 are expanded out as above, while any longer lists are kept in a single column with the data in its original nested structure and _dense appended to the end of the field name. These fields are not expected to be queried regularly in standard analyses.

For example, the diagnosis field of the Encounter resource has 0..* cardinality. If a single diagnosis is present, the field is expanded out:

    "diagnosis": [
        {
            "condition": [{"reference": {"reference": "Condition/stroke"}}],
            "use": [
                {
                    "coding": [
                        {
                            "system": "http://terminology.hl7.org/CodeSystem/diagnosis-role",
                            "code": "AD",
                            "display": "Admission diagnosis"
                        }
                    ]
                }
            ]
        }
    ]

becomes

    diagnosis.condition.reference    Condition/stroke
    diagnosis.use.code               http://terminology.hl7.org/CodeSystem/diagnosis-role|AD
    diagnosis.use.text               Admission diagnosis

whereas if two different diagnoses are present

    "diagnosis": [
        {
            "condition": [{"reference": {"reference": "Condition/stroke"}}],
            "use": [
                {
                    "coding": [
                        {
                            "system": "http://terminology.hl7.org/CodeSystem/diagnosis-role",
                            "code": "AD",
                            "display": "Admission diagnosis"
                        }
                    ]
                }
            ]
        },
        {
            "condition": [{"reference": {"reference": "Condition/f201"}}],
            "use": [
                {
                    "coding": [
                        {
                            "system": "http://terminology.hl7.org/CodeSystem/diagnosis-role",
                            "code": "DD",
                            "display": "Discharge diagnosis"
                        }
                    ]
                }
            ]
        }
    ]

becomes

    encounter.diagnosis_dense    "[{"condition": [{"reference"…}]}]"
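The length-based rule can be sketched as below. This is a simplified illustration rather than the actual implementation: the function name flatten is hypothetical, the input is reduced compared with the Encounter example (the condition reference is a plain string and use is omitted), and serialising the dense column as a JSON string is an assumption.

```python
import json


def flatten(value, path: str, out: dict) -> dict:
    """Flatten nested data, expanding one-item lists and keeping longer
    lists whole under a "<field>_dense" column (illustrative sketch)."""
    if isinstance(value, dict):
        for key, child in value.items():
            flatten(child, f"{path}.{key}" if path else key, out)
    elif isinstance(value, list):
        if len(value) == 1:
            flatten(value[0], path, out)  # single entry: expand in place
        elif len(value) > 1:
            # Several entries: keep the nested structure in one column.
            out[f"{path}_dense"] = json.dumps(value)
        # Empty lists produce no column at all.
    else:
        out[path] = value
    return out


one = {"diagnosis": [{"condition": [{"reference": "Condition/stroke"}]}]}
two = {
    "diagnosis": [
        {"condition": [{"reference": "Condition/stroke"}]},
        {"condition": [{"reference": "Condition/f201"}]},
    ]
}
single_row = flatten(one, "", {})
dense_row = flatten(two, "", {})
print(single_row)
print(dense_row)
```

With one diagnosis the row gets an ordinary dotted column, while two diagnoses yield a single diagnosis_dense column that can be parsed back with json.loads when needed.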