The form validator uses QC rules defined as JSON or YAML data objects to check data, based on Cerberus. In a nutshell:
- The data record (as a `dict`) will be evaluated against the validation schema

Example:
| YAML Rule Definition | JSON Rule Definition | When Validating |
|---|---|---|
|
|
|
See the Cerberus documentation for the full list of built-in rules.
Keywords frequently used in UDS rules are described in the table below:
| Keyword | Description | Example |
|---|---|---|
| `allowed` | Specify the list of allowed values. Validation will fail if any other value is given in the data record. | |
| `forbidden` | Specify the list of forbidden values. Validation will fail if values in this list are included in the record. | |
| `min`, `max` | Minimum and maximum value allowed (only applicable to object types which support comparison operations, like integers or floats). Each keyword can be used independently. Use both together to define a range. | |
| `nullable` | If set to "true", the field value is allowed to be empty. This rule is checked on every field, regardless of whether it is defined, and defaults to "false". In other words, if neither `nullable` nor `required` is set, the field is required. | |
| `required` | If set to "true", the field is mandatory. Validation will fail when it is missing. | |
| `type` | Data type allowed for the value. See the Cerberus documentation for the list of type names. If multiple types are allowed, you can specify the types as a list. | |
| `anyof` | Allows defining different sets of rules to validate against, supplied in a list of dicts. The field is considered valid if any of the provided constraints validates it. | |
| `regex` | Regex to validate against; only valid for string values. | |
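
To make the keyword descriptions above concrete, here is a minimal, standard-library-only sketch of how a record could be checked against a JSON rule definition. It is a stand-in written for illustration, not Cerberus or the validator itself: the field names (`ptid`, `sex`, `notes`), the `check_record` helper, and the error messages are all hypothetical. It handles only a single `type` name per field and assumes blank fields are present in the record with the value `None`.

```python
import json

# Hypothetical field names; the rule keywords come from the table above.
RULES_JSON = """
{
    "ptid": {"type": "integer", "required": true, "min": 1},
    "sex": {"type": "integer", "allowed": [1, 2, 9]},
    "notes": {"type": "string", "nullable": true}
}
"""

# Only a few type names are mapped in this sketch.
TYPES = {"integer": int, "float": float, "string": str}


def check_record(record, schema):
    """Return {field: message} for every field that breaks one of its rules."""
    errors = {}
    for field, rules in schema.items():
        if field not in record:
            if rules.get("required", False):
                errors[field] = "required field is missing"
            continue
        value = record[field]
        if value is None:
            # Empty values only pass when nullable is explicitly true.
            if not rules.get("nullable", False):
                errors[field] = "empty value not allowed"
            continue
        if "type" in rules and not isinstance(value, TYPES[rules["type"]]):
            errors[field] = "wrong type"
        elif "allowed" in rules and value not in rules["allowed"]:
            errors[field] = "value not allowed"
        elif "forbidden" in rules and value in rules["forbidden"]:
            errors[field] = "value is forbidden"
        elif "min" in rules and value < rules["min"]:
            errors[field] = "below min"
        elif "max" in rules and value > rules["max"]:
            errors[field] = "above max"
    return errors


schema = json.loads(RULES_JSON)
good = check_record({"ptid": 101, "sex": 1, "notes": None}, schema)
bad = check_record({"sex": 3, "notes": None}, schema)
```

Here `good` is empty, while `bad` reports the missing `ptid` and the disallowed `sex` value.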
The custom rules described below are defined in the `NACCValidator` class.
The `compare_with` rule is used to validate the field based on comparison with another field, with optional adjustments. It takes the following parameters:
- `comparator`: The comparison expression; can be one of `[">", "<", ">=", "<=", "==", "!="]`
- `base`: The field or value to compare to
- `adjustment`: The adjustment to make to the base expression, if any. If specified, `op` must also be provided
- `op`: The operation to make the adjustment for; can be one of `["+", "-", "*", "/", "abs"]`. If specified, `adjustment` must also be provided
    - For the arithmetic operations, the rule follows the formula `field {comparator} base {op} adjustment`, e.g. `field <= base + adjustment`
    - For `abs`, it instead follows the formula `abs(field - base) {comparator} adjustment`, e.g. `abs(field - base) <= adjustment`
- `previous_record`: Optional boolean - if True, will search for `base` in the previous record and make the comparison against that
- `initial_record`: Optional boolean - if True, will search for `base` in the initial record and make the comparison against that
    - An error is thrown if both `initial_record` and `previous_record` are set to True
- `ignore_empty`: Optional boolean - if comparing to previous record(s), set this to True to ignore records where the specified `base` is empty
    - Cannot be combined with `initial_record`; an error is thrown if `initial_record` is True and `ignore_empty` is provided

The value to compare to (`base`) can be another field in the schema or a special keyword, related either to the current date (i.e. the exact time/date at time of validation) or to a previous record:
- `current_date`: Compare to the current date
- `current_year`: Compare to the current year
- `current_month`: Compare to the current month
- `current_day`: Compare to the current day

The rule definition for `compare_with` should use the following format:
```json
{
    "field_name": {
        "compare_with": {
            "comparator": "comparator, one of >, <, >=, <=, ==, !=",
            "base": "field or value to compare field_name to",
            "adjustment": "(optional) the adjustment field or value",
            "op": "(optional) operation, one of +, -, *, /, abs",
            "previous_record": "(optional) boolean, whether or not to compare to base in the previous record",
            "initial_record": "(optional) boolean, whether or not to compare to base in the initial record",
            "ignore_empty": "(optional) boolean, whether or not to ignore previous records where this field is empty"
        }
    }
}
```
Example:
`birthyr <= current_year - 15`, i.e. birthyr must be at least 15 years prior to the current year.
| YAML Rule Definition | JSON Rule Definition | When Validating |
|---|---|---|
|
|
|
Absolute Value Example
`abs(waist1 - waist2) <= 0.5`, i.e. the difference between waist1 and waist2 cannot be more than 0.5
| YAML Rule Definition | JSON Rule Definition | When Validating |
|---|---|---|
|
|
|
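
The two `compare_with` formulas can be sketched in plain Python as follows. This is an illustrative approximation rather than the `NACCValidator` implementation: the `check_compare_with` helper and its `special_values` argument are hypothetical, and the previous/initial record options are left out. The current year is passed in explicitly so the example stays deterministic.

```python
import operator

COMPARATORS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
               "<=": operator.le, "==": operator.eq, "!=": operator.ne}
ARITHMETIC = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "/": operator.truediv}


def check_compare_with(record, field, rule, special_values):
    """Return True if record[field] satisfies the compare_with rule."""
    op, adjustment = rule.get("op"), rule.get("adjustment")
    if (op is None) != (adjustment is None):
        raise ValueError("op and adjustment must be provided together")

    # base is a special keyword, another field, or a literal value.
    base = rule["base"]
    if isinstance(base, str):
        base = special_values.get(base, record.get(base))
    # adjustment may also name another field.
    if isinstance(adjustment, str):
        adjustment = record[adjustment]

    compare = COMPARATORS[rule["comparator"]]
    value = record[field]
    if op is None:
        return compare(value, base)
    if op == "abs":
        # abs(field - base) {comparator} adjustment
        return compare(abs(value - base), adjustment)
    # field {comparator} base {op} adjustment
    return compare(value, ARITHMETIC[op](base, adjustment))


# birthyr <= current_year - 15
birthyr_rule = {"comparator": "<=", "base": "current_year", "op": "-", "adjustment": 15}
# abs(waist1 - waist2) <= 0.5
waist_rule = {"comparator": "<=", "base": "waist2", "op": "abs", "adjustment": 0.5}

special = {"current_year": 2024}  # fixed for the example
birthyr_ok = check_compare_with({"birthyr": 2000}, "birthyr", birthyr_rule, special)
waist_ok = check_compare_with({"waist1": 30.0, "waist2": 30.4}, "waist1", waist_rule, special)
```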
The `compare_age` rule is used to compare ages. It takes the following parameters:
- `comparator`: The comparison expression; can be one of `[">", "<", ">=", "<=", "==", "!="]`
- `birth_year`: The birth year (or year the age should start)
- `birth_month`: The birth month (or month the age should start); optional, defaults to 1 (first month of the year)
- `birth_day`: The birth day (or day the age should start); optional, defaults to 1 (first day of the month)
- `compare_to`: Variable name, integer, or list of variable names/integers to compare the field’s age to

Age is calculated by combining and converting the `birth_*` fields into a date, and then calculating the age at the field (assuming the field’s value is also a valid date). The exact calculation is `age = (field_value - birth_date).days / 365.25`. This result must then satisfy the comparison formula against the `compare_to` field, e.g. `age <= compare_to`.
Additional things to note:
- Applies to date fields or string fields with formatting: date
- `birth_year` is required, but specifying the month and day allows the comparison to be more fine-grained
- `compare_with` (`current_date`, `current_year`, etc.)

The rule definition for `compare_age` should use the following format:
```json
{
    "field_name": {
        "compare_age": {
            "comparator": "comparator, one of >, <, >=, <=, ==, !=",
            "birth_year": "the birth year (or year the age should start)",
            "birth_month": "the birth month (or month the age should start) - optional, defaults to 1 (first month of the year)",
            "birth_day": "the birth day (or day the age should start) - optional, defaults to 1 (first day of the month)",
            "compare_to": "variable name, integer, or list of variable names/integers to compare to"
        }
    }
}
```
Example:
age at frmdate >= behage
| YAML Rule Definition | JSON Rule Definition | When Validating |
|---|---|---|
|
|
|
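
The age calculation described above can be sketched with the standard `datetime` module. The `compare_age` function below is a hypothetical helper, not the validator's own code; in particular, requiring every entry of a `compare_to` list to satisfy the comparison is an assumption made for this example, and `compare_to` values are taken as already-resolved numbers.

```python
from datetime import date
import operator

COMPARATORS = {">": operator.gt, "<": operator.lt, ">=": operator.ge,
               "<=": operator.le, "==": operator.eq, "!=": operator.ne}


def compare_age(field_value, comparator, birth_year, compare_to,
                birth_month=1, birth_day=1):
    """Return True if the age at `field_value` satisfies the comparison."""
    birth_date = date(birth_year, birth_month, birth_day)
    # Formula quoted in the text above.
    age = (field_value - birth_date).days / 365.25
    targets = compare_to if isinstance(compare_to, list) else [compare_to]
    # Assumption: each listed value must satisfy the comparison.
    return all(COMPARATORS[comparator](age, target) for target in targets)


# age at frmdate >= behage, with behage already resolved to 70
frmdate = date(2024, 6, 1)
passes = compare_age(frmdate, ">=", birth_year=1950, compare_to=70)
fails = compare_age(frmdate, ">=", birth_year=1950, compare_to=80)
```

With a birth year of 1950 and no month or day given, the birth date defaults to 1 January 1950, so the age at `frmdate` is a little over 74.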
The `compatibility` rule is used to specify the list of compatibility (if-then-else) constraints for a given field with other fields within the form or across multiple forms. A field will only pass validation if none of the compatibility constraints are violated.
Each constraint specifies `if`, `then`, and (optionally) `else` attributes to allow the validation of a set of fields/subschemas based on the outcome of other fields/subschemas (i.e. when the schema(s) defined under `if` evaluate to true for a given record, the schema(s) specified under `then` will be evaluated).
Each `if`/`then`/`else` attribute can have several fields which need to be satisfied, with the `*_op` attribute specifying the boolean operation used to combine the different fields. For example, if `if_op = or`, then as long as any of the fields satisfy their schema, the `then` attribute will be evaluated. The default `*_op` is `and`.
The rule definition for `compatibility` uses the following format:
```json
{
    "<field_name>": {
        "compatibility": [
            {
                "if": {
                    "<field_name>": "subschema to be satisfied"
                },
                "then": {
                    "<field_name>": "subschema to be satisfied"
                }
            },
            {
                "if_op": "and",
                "then_op": "or",
                "else_op": "or",
                "if": {
                    "<field_name>": "subschema to be satisfied",
                    "<field_name>": "subschema to be satisfied"
                },
                "then": {
                    "<field_name>": "subschema to be satisfied",
                    "<field_name>": "subschema to be satisfied"
                },
                "else": {
                    "<field_name>": "subschema to be satisfied",
                    "<field_name>": "subschema to be satisfied"
                }
            }
        ]
    }
}
```
One additional nuance is the evaluation against None/null values. Because Cerberus always evaluates `"nullable": False` by default, the application of a subschema in this case must explicitly set `"nullable": True` if the attributes evaluate or result in null values. For example:
```python
# if case: If PARENTVAR is blank or 88, then VAR1 must be blank
{
    "if": {
        "parentvar": {
            "nullable": True,  # <--- external nullable flag for the if clause
            "anyof": [
                {
                    "nullable": True,
                    "filled": False
                },
                {
                    "allowed": [88]
                }
            ]
        }
    },
    "then": {
        "var1": {
            "nullable": True,
            "filled": False
        }
    }
}

# then case: if PARENTVAR is blank, then the following must be blank: var1, var2, var3
{
    "if": {
        "parentvar": {
            "nullable": True,
            "filled": False
        }
    },
    "then": {
        "var1": {
            "nullable": True,  # <--- nullable flag for the then clause
            "logic": {
                "formula": {
                    "and": [
                        {"==": [None, {"var": "var1"}]},
                        {"==": [None, {"var": "var2"}]},
                        {"==": [None, {"var": "var3"}]}
                    ]
                }
            }
        }
    }
}
```
Examples:
If field incntmod (primary contact mode with participant) is 6, then field incntmdx (specify primary contact mode with participant) cannot be blank.
| YAML Rule Definition | JSON Rule Definition | When Validating |
|---|---|---|
|
|
|
This rule can also be used to define the “if not, then” case. For this, we use forbidden instead of allowed.
So if field incntmod (primary contact mode with participant) is NOT 6, then field incntmdx (specify primary contact mode with participant) must be blank.
| YAML Rule Definition | JSON Rule Definition | When Validating |
|---|---|---|
|
|
|
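
The if/then/else evaluation, including the `*_op` handling and the nullable nuance, can be approximated with the sketch below. It is not the validator's implementation: `satisfies` is a hypothetical stand-in for a Cerberus subschema check that understands only `allowed`, `forbidden`, `filled`, and `nullable`, and the two constraints are one plausible way to write the `incntmod`/`incntmdx` examples, not necessarily the definitions used in production.

```python
def satisfies(value, rules):
    """Tiny stand-in for a subschema check (allowed/forbidden/filled/nullable)."""
    if value is None:
        # Null only passes when nullable is explicitly true and the
        # subschema does not demand a filled value.
        return rules.get("nullable", False) and not rules.get("filled", False)
    if rules.get("filled") is False:
        return False  # the field was required to be blank
    if "allowed" in rules and value not in rules["allowed"]:
        return False
    if "forbidden" in rules and value in rules["forbidden"]:
        return False
    return True


def evaluate_group(record, schemas, op):
    """Combine the per-field results with the given boolean operation."""
    results = [satisfies(record.get(field), rules) for field, rules in schemas.items()]
    return any(results) if op == "or" else all(results)


def check_compatibility(record, constraint):
    """Return True if the record does not violate the constraint."""
    if evaluate_group(record, constraint["if"], constraint.get("if_op", "and")):
        return evaluate_group(record, constraint["then"], constraint.get("then_op", "and"))
    if "else" in constraint:
        return evaluate_group(record, constraint["else"], constraint.get("else_op", "and"))
    return True


# If incntmod is 6, then incntmdx cannot be blank.
if_six = {
    "if": {"incntmod": {"allowed": [6]}},
    "then": {"incntmdx": {"nullable": False, "filled": True}},
}
# If incntmod is NOT 6, then incntmdx must be blank.
if_not_six = {
    "if": {"incntmod": {"forbidden": [6]}},
    "then": {"incntmdx": {"nullable": True, "filled": False}},
}

violates = check_compatibility({"incntmod": 6, "incntmdx": None}, if_six)
complies = check_compatibility({"incntmod": 6, "incntmdx": "video call"}, if_six)
```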
The `logic` rule is used to specify a mathematical formula/expression to validate against, and utilizes json-logic-py (saved as `json_logic.py`). This rule overlaps with `compare_with`, but allows for comparison between multiple fields as well as more complex, nested mathematical expressions. That being said, it does not support special keywords such as `current_year`.
- `formula`: The mathematical formula/expression to apply; see the `operations` dict in `json_logic.py` for the full list of available operators. Each operator expects differently formatted arguments
- `errormsg`: A custom message to supply if validation fails. This key is optional; if not provided, the error message will simply be `value {value} does not satisify the specified formula`

The rule definition for `logic` should use the following format:
```json
{
    "<field_name>": {
        "logic": {
            "formula": {
                "operator": "list of arguments for the operator"
            },
            "errormsg": "<optional error message to supply if validation fails>"
        }
    }
}
```
Example:
One of var1, var2, or var3 must be 1.
| YAML Rule Definition | JSON Rule Definition | When Validating |
|---|---|---|
|
|
|
The validator also has custom operators in addition to the ones provided by json-logic-py:
| Operator | Arguments | Description |
|---|---|---|
| `count` | `[var1, var2, var3...]` | Counts how many valid variables are in the list, ignoring null and 0 values |
| `count_exact` | `[base, var1, var2, var3, ...]` | Counts how many values in the list equal the base. The first value is always considered the base, and the rest of the list is compared to it, so this operator requires at least 2 items. |
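
To illustrate how a `logic` formula and the two custom operators behave, the following sketch evaluates a very small subset of JSON Logic by hand. It is not json-logic-py: the `evaluate` function is a hypothetical, simplified evaluator that supports only `var`, `==`, `and`, `or`, `count`, and `count_exact`, and its `and`/`or` return plain booleans.

```python
def evaluate(expr, record):
    """Evaluate a small JSON-Logic-style expression against a record."""
    if not isinstance(expr, dict):
        return expr  # literal value
    (op, args), = expr.items()  # each expression has exactly one operator
    if op == "var":
        return record.get(args)
    values = [evaluate(arg, record) for arg in args]
    if op == "==":
        return values[0] == values[1]
    if op == "and":
        return all(values)
    if op == "or":
        return any(values)
    if op == "count":
        # Count the values that are neither null nor 0.
        return sum(1 for value in values if value not in (None, 0))
    if op == "count_exact":
        # The first value is the base; count how many of the rest equal it.
        base, *rest = values
        return sum(1 for value in rest if value == base)
    raise ValueError(f"unsupported operator: {op}")


# One of var1, var2, or var3 must be 1.
one_must_be_1 = {
    "or": [
        {"==": [1, {"var": "var1"}]},
        {"==": [1, {"var": "var2"}]},
        {"==": [1, {"var": "var3"}]},
    ]
}
all_vars = [{"var": "var1"}, {"var": "var2"}, {"var": "var3"}]

record = {"var1": 0, "var2": 1, "var3": None}
formula_ok = evaluate(one_must_be_1, record)
valid_count = evaluate({"count": all_vars}, record)
ones = evaluate({"count_exact": [1] + all_vars}, {"var1": 1, "var2": 1, "var3": 0})
```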
The `temporalrules` rule is used to specify the list of checks to be performed against the previous visit for the same participant.
Each constraint specifies previous and current attributes. If conditions specified under the previous subschema are satisfied by the previous visit record, then the current visit record must satisfy the conditions specified under current subschema.
Each constraint also has optional fields that can be set:
- Each `previous`/`current` attribute can have several fields which need to be satisfied, so an optional `*_op` attribute can be used to specify the boolean operation used to combine the different fields. For example, if `prev_op = or`, then as long as any of the fields satisfy their schema, the `current` attribute will be evaluated. The default `*_op` is `and`.
- `initial_record`: If set to True, will grab the initial record (not necessarily the previous one)
- `ignore_empty`: Takes a string or list of strings denoting fields that cannot be empty
    - Cannot be combined with `initial_record`; an error is thrown if `initial_record` is True and `ignore_empty` fields are provided
- `swap_order`: If set to True, swaps the order of operations, evaluating the `current` subschema first, then the `previous` subschema

NOTE: To validate `temporalrules`, the validator should have a `Datastore` instance which will be used to retrieve the previous visit record(s) for the participant.
The rule definition for temporalrules should follow the following format:
```json
{
    "<parent_field_name>": {
        "temporalrules": [
            {
                "previous": {
                    "<field_name>": "subschema to be satisfied for the previous record"
                },
                "current": {
                    "<field_name>": "subschema to be satisfied for the current record, will be evaluated if the previous subschema is satisfied"
                }
            },
            {
                "ignore_empty": ["<field_name_1>", "<field_name_2>"],
                "prev_op": "or",
                "previous": {
                    "<field_name_1>": "subschema to be satisfied for the previous record",
                    "<field_name_2>": "subschema to be satisfied for the previous record"
                },
                "current": {
                    "<field_name>": "subschema to be satisfied for the current record, will be evaluated if either of the previous subschemas are satisfied"
                }
            },
            {
                "swap_order": true,
                "previous": {
                    "<field_name>": "subschema to be satisfied for the previous record, will be evaluated if the current subschema is satisfied"
                },
                "current": {
                    "<field_name>": "subschema to be satisfied for the current record"
                }
            },
            {
                "initial_record": true,
                "previous": {
                    "<field_name>": "subschema to be satisfied for the initial record"
                },
                "current": {
                    "<field_name>": "subschema to be satisfied for the current record, will be evaluated if the previous (representing initial) subschema is satisfied"
                }
            }
        ]
    }
}
```
Example:
If field taxes (difficulty with taxes, business, and other papers) is 0 (normal) at a previous visit, then taxes cannot be 8 (not applicable/never did) at the follow-up visit.
| YAML Rule Definition | JSON Rule Definition | When Validating |
|---|---|---|
|
|
|
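
The previous/current evaluation order can be sketched as below. This is a simplified illustration rather than the validator's logic: the previous record is passed in directly instead of being fetched from a `Datastore`, `satisfies` is a hypothetical stand-in that understands only `allowed`, `forbidden`, and `nullable`, and `initial_record` and `ignore_empty` are not modelled. The constraint is one plausible encoding of the `taxes` example.

```python
def satisfies(value, rules):
    """Tiny stand-in for a subschema check (allowed/forbidden/nullable only)."""
    if value is None:
        return rules.get("nullable", False)
    if "allowed" in rules and value not in rules["allowed"]:
        return False
    if "forbidden" in rules and value in rules["forbidden"]:
        return False
    return True


def check_temporal(previous_record, current_record, constraint):
    """Return True if the pair of records does not violate the constraint."""
    prev_results = [satisfies(previous_record.get(field), rules)
                    for field, rules in constraint["previous"].items()]
    combine = any if constraint.get("prev_op", "and") == "or" else all
    prev_ok = combine(prev_results)
    # Simplification: every field under "current" must pass.
    curr_ok = all(satisfies(current_record.get(field), rules)
                  for field, rules in constraint["current"].items())
    if constraint.get("swap_order", False):
        # current acts as the condition, previous as the requirement
        return prev_ok if curr_ok else True
    # previous acts as the condition, current as the requirement
    return curr_ok if prev_ok else True


# If taxes was 0 at the previous visit, it cannot be 8 at the follow-up visit.
taxes_rule = {
    "previous": {"taxes": {"allowed": [0]}},
    "current": {"taxes": {"forbidden": [8]}},
}

violation = check_temporal({"taxes": 0}, {"taxes": 8}, taxes_rule)
no_violation = check_temporal({"taxes": 0}, {"taxes": 1}, taxes_rule)
```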
The `check_adcid` rule is used to check whether a specified ADCID is valid.
This validation is implemented using the function rule with custom check_adcid function in the NACCValidator. The rule definition should be in the following format:
```json
{
    "<adcid_variable>": {
        "function": {
            "name": "check_adcid",
            "args": {"own": "<bool, whether to validate against own ADCID or list of current ADCIDs; defaults to True>"}
        }
    }
}
```
NOTE: To validate `check_adcid`, the validator should have a `Datastore` instance which implements the `is_valid_adcid` function (which should have access to the center’s ADCID and the list of current ADCIDs).
Example:
The adcid must match the center’s own ADCID, whereas oldadcid should be a valid ADCID in the current ADCIDs list.
| YAML Rule Definition | JSON Rule Definition | When Validating |
|---|---|---|
|
|
|
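
As a rough illustration of the datastore dependency, the sketch below pairs an in-memory stand-in with the two rule definitions from the example. The `InMemoryDatastore` class, the `check_adcid` helper, and the `is_valid_adcid(adcid, own)` signature are assumptions made for this example; the real `Datastore` interface may differ.

```python
class InMemoryDatastore:
    """Hypothetical stand-in for a Datastore that knows about ADCIDs."""

    def __init__(self, own_adcid, current_adcids):
        self.own_adcid = own_adcid
        self.current_adcids = set(current_adcids)

    def is_valid_adcid(self, adcid, own):
        # Assumed signature: `own` selects which check to apply.
        if own:
            return adcid == self.own_adcid
        return adcid in self.current_adcids


rules = {
    # adcid must match the center's own ADCID ("own" defaults to True)
    "adcid": {"function": {"name": "check_adcid"}},
    # oldadcid only needs to appear in the list of current ADCIDs
    "oldadcid": {"function": {"name": "check_adcid", "args": {"own": False}}},
}


def check_adcid(datastore, value, rule):
    """Apply a check_adcid rule definition to a single value."""
    own = rule["function"].get("args", {}).get("own", True)
    return datastore.is_valid_adcid(value, own)


datastore = InMemoryDatastore(own_adcid=2, current_adcids=[1, 2, 3])
record = {"adcid": 2, "oldadcid": 3}
results = {field: check_adcid(datastore, record[field], rule)
           for field, rule in rules.items()}
```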
`compute_gds` is a custom rule defined to validate the Geriatric Depression Scale (GDS) score computation. It should only be used for validating the `gds` field in UDS Form B6.
The rule definition for compute_gds should follow the following format:
```json
{
    "gds": {
        "compute_gds": ["list of fields used in GDS score computation"]
    }
}
```
`rxnorm` is a custom rule defined to check whether a given Drug ID is a valid RXCUI code.
This rule uses the `check_with` rule from Cerberus. The rule definition should be in the following format:
```json
{
    "<rxnormid variable>": {
        "check_with": "rxnorm"
    }
}
```
NOTE: To validate `rxnorm`, the validator should have a `Datastore` instance which implements the `is_valid_rxcui` function, which will check whether the given rxnormid value is a valid RXCUI code.
`score_variables` is a custom rule that scores the number of correct or incorrect variables based on a mode and scoring key.
This validation is implemented using the function rule with custom score_variables function in the NACCValidator. The rule definition should be in the following format:
```json
{
    "<score_variable>": {
        "function": {
            "name": "score_variables",
            "args": {
                "mode": "'correct' or 'incorrect'",
                "scoring_key": {
                    "var1": "correct_value_1",
                    "var2": "correct_value_2",
                    "...etc"
                },
                "logic": {
                    "...same as logic formula"
                },
                "calc_var_name": "(optional) name of the variable to store calculation under; defaults to __total_sum"
            }
        }
    }
}
```
- `mode`: Either `correct` or `incorrect`; if `correct`, will count the number of correct variables, and if `incorrect`, will count the number of incorrect variables
- `scoring_key`: Dict representing the scoring key; maps each variable involved in the scoring function to its correct value
- `logic`: Logic to perform on the `calc_var_name` once calculated; same schema as the `logic` rule defined earlier
- `calc_var_name`: (Optional) Name of the variable to store the calculation under; defaults to `__total_sum`. This variable MUST be unique to your record or else the validator will throw an error

This function looks at all variables defined in the `scoring_key` and counts the number that are correct if the mode is `correct`, or incorrect if the mode is `incorrect`. It stores this result in the variable defined by `calc_var_name` (defaults to `__total_sum`; note the double underscore for uniqueness), which can then be used inside the `logic` formula to compare against.
If any of the fields in scoring_key are missing or invalid, then validation is skipped and this rule “passes” by default. Otherwise validation succeeds if calc_var_name satisfies the given formula, else it fails.
Example:
total must match the total number of correct variables defined under scoring_key.
| YAML Rule Definition | JSON Rule Definition | When Validating |
|---|---|---|
|
|
|
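
The scoring step can be sketched as follows. This is an approximation under stated assumptions, not the `NACCValidator` function: `score_variables` and `total_matches` are hypothetical helpers, the scoring key uses made-up variables `q1` to `q3`, only missing (null) values trigger the skip, and the `logic` step is reduced to a plain equality between `total` and the calculated variable.

```python
def score_variables(record, mode, scoring_key, calc_var_name="__total_sum"):
    """Return a copy of the record with the score added, or None if skipped."""
    if calc_var_name in record:
        raise ValueError(f"{calc_var_name} already exists in the record")
    # Skip scoring when any variable in the key is missing.
    if any(record.get(var) is None for var in scoring_key):
        return None
    correct = sum(1 for var, answer in scoring_key.items() if record[var] == answer)
    score = correct if mode == "correct" else len(scoring_key) - correct
    return {**record, calc_var_name: score}


def total_matches(record, scoring_key):
    """total must equal the number of correct variables."""
    scored = score_variables(record, "correct", scoring_key)
    if scored is None:
        return True  # validation is skipped, so the rule passes by default
    # Stand-in for the logic formula {"==": [{"var": "total"}, {"var": "__total_sum"}]}
    return scored["total"] == scored["__total_sum"]


scoring_key = {"q1": 1, "q2": 2, "q3": 3}  # hypothetical variables
matches = total_matches({"q1": 1, "q2": 2, "q3": 0, "total": 2}, scoring_key)
mismatch = total_matches({"q1": 1, "q2": 2, "q3": 0, "total": 3}, scoring_key)
skipped = total_matches({"q1": 1, "q2": None, "q3": 0, "total": 3}, scoring_key)
```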