This gear reads a JSON form data file and uses a form- and study-specific set of QC rules to check the values.
- QC rule definitions are organized by form and stored in an S3 bucket.
- For optional forms (A2, B5, B7, etc.), there are two rule definition files: one to be evaluated when the form is submitted and the other to be evaluated when the form is skipped.
- Depending on whether the form was submitted for a particular visit (inferred from the header variable MODExx for that form), this gear loads the appropriate definition file from the S3 bucket and constructs the validation schema for that visit.
- The validation schema and the input data record are passed to the NACC Form Validator library for rule evaluation.
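The selection between the two definition files for an optional form can be sketched as follows. This is a minimal illustration, not the gear's actual code: the S3 key layout (`rules/<FORM>/submitted.json`, `rules/<FORM>/skipped.json`), the function names, and the mode value that marks a form as skipped are all hypothetical.

```python
from typing import Any, Mapping, Set


def mode_variable(form: str) -> str:
    """Name of the header variable for a form, e.g. 'A2' -> 'MODEA2'."""
    return f"MODE{form.upper()}"


def rule_definition_key(
    record: Mapping[str, Any],
    form: str,
    optional_forms: Set[str],
    skipped_values: Set[Any],
    prefix: str = "rules",
) -> str:
    """Return the (hypothetical) S3 key of the rule definition to load.

    Only optional forms have a separate "skipped" definition; every other
    form always uses the "submitted" definition.
    """
    form = form.upper()
    if form in optional_forms and record.get(mode_variable(form)) in skipped_values:
        return f"{prefix}/{form}/skipped.json"
    return f"{prefix}/{form}/submitted.json"


# Illustrative values only: the real mode codes and key names may differ.
record = {"MODEA2": 0, "MODEB5": 1}
optional = {"A2", "B5", "B7"}
print(rule_definition_key(record, "A2", optional, skipped_values={0}))
print(rule_definition_key(record, "B5", optional, skipped_values={0}))
```

Passing the "skipped" codes in as a parameter keeps the sketch independent of the actual MODExx coding, which is not specified here.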
Environment
This gear uses the AWS SSM parameter store, and expects that AWS credentials are available in environment variables (AWS_SECRET_ACCESS_KEY, AWS_ACCESS_KEY_ID, AWS_DEFAULT_REGION) within the Flywheel runtime.
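When debugging a failed run, it can help to confirm that the three variables are actually set. The following is a small standalone check, not part of the gear; the helper name `missing_aws_vars` is made up for this example.

```python
import os
from typing import List, Mapping

REQUIRED_AWS_VARS = (
    "AWS_SECRET_ACCESS_KEY",
    "AWS_ACCESS_KEY_ID",
    "AWS_DEFAULT_REGION",
)


def missing_aws_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required AWS variables that are unset or empty."""
    return [name for name in REQUIRED_AWS_VARS if not env.get(name)]


missing = missing_aws_vars()
if missing:
    print("Missing AWS settings:", ", ".join(missing))
else:
    print("All required AWS settings are present")
```

The check only reports variable names and never prints their values, so it is safe to leave in job logs.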
Running
Ideally, QC checks for a participant should be triggered by the Form QC Coordinator. Avoid manually running this gear on individual visit files.
- form_data_file: The form data file (JSON or CSV) to validate. Required.
- form_configs_file: A JSON file with the forms module configurations. Required.
- supplement_data_file: Optional input file for supplement module.
Configs
Gear configs are defined in manifest.json.
Error Report
- For each validation error, the gear generates and stores error metadata in Flywheel.
- Errors generated by the validation library are mapped to NACC data quality check codes for easy reference.
- The error report for each project is available in the Flywheel Issue Manager UI.
- The error report can be downloaded as a CSV file from the UI or pulled programmatically using a Flywheel SDK script. An example script can be found here.
After processing, the gear updates the input file with the following metadata. See the QC Conventions reference for details on the data models and conventions used.
- QC Result: A validation QC result is added to the file’s file.info.qc metadata with:
  - name: "validation"
  - state: "PASS" or "FAIL", depending on whether the QC checks passed
  - data: List of FileError objects with error details (codes, messages, locations), if any validation errors occurred
- Validation Timestamp: The file’s .info metadata is updated with a validated_timestamp field set to the current UTC time. This is used by the form-qc-coordinator to determine whether a file needs re-validation.
- Gear Tags: The file is tagged using the GearTags mechanism with status-specific tags:
  - Adds form-qc-checker-PASS or form-qc-checker-FAIL based on the QC outcome
  - Previous status tags for this gear are removed before adding the new one
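The three metadata updates can be pictured with plain Python data structures. This is only a sketch under assumptions: the exact nesting under file.info.qc, the timestamp format, and the function name `apply_qc_outcome` are hypothetical, and errors are shown as plain dictionaries rather than real FileError objects. No Flywheel SDK calls are made.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional, Tuple

GEAR_NAME = "form-qc-checker"
STATUS_TAGS = {f"{GEAR_NAME}-PASS", f"{GEAR_NAME}-FAIL"}


def apply_qc_outcome(
    info: Dict[str, Any],
    tags: List[str],
    errors: List[Dict[str, Any]],
    now: Optional[datetime] = None,
) -> Tuple[Dict[str, Any], List[str]]:
    """Return updated copies of a file's info metadata and tags."""
    state = "FAIL" if errors else "PASS"
    now = now or datetime.now(timezone.utc)

    # QC result (the placement under info["qc"] is an assumed layout).
    new_info = dict(info)
    qc = dict(new_info.get("qc", {}))
    qc["validation"] = {"name": "validation", "state": state, "data": list(errors)}
    new_info["qc"] = qc

    # Validation timestamp in UTC (ISO 8601 is an assumed format).
    new_info["validated_timestamp"] = now.isoformat()

    # Drop any previous status tag for this gear, then add the new one.
    new_tags = [tag for tag in tags if tag not in STATUS_TAGS]
    new_tags.append(f"{GEAR_NAME}-{state}")
    return new_info, new_tags


info, tags = apply_qc_outcome(
    info={},
    tags=["form-qc-checker-FAIL", "uploaded"],
    errors=[],
)
print(info["qc"]["validation"]["state"], tags)
```

Returning new copies instead of mutating the inputs keeps the sketch easy to test; the real gear writes these values back to the Flywheel file.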