- Summary
- Current Implementation Status (2026-03-10)
- Export Mode Design
- Target User Flow
- Syncable File Model
- Export Controls
- REDCap Built-In De-Identification Parameters
- De-Identification And Encryption
- Metadata Outputs
- Architecture In rdm-integration
- Step-By-Step Implementation Plan
- Testing Plan
- Open Questions
- References
This document describes the redcap2 plugin, which coexists with the current redcap plugin.
- Keep current `redcap` unchanged (File Repository mode).
- Add `redcap2` for direct API exports (without manual "export then save to File Repository").
- Start with a report-first workflow, then expand to more advanced export/de-identification/metadata features.
Key point: manual export/save was required in the old redcap plugin because it uses REDCap fileRepository list/export actions (folder_id / doc_id flow).
PoC branch: The proof-of-concept is being developed on the redcap_v2 branch (same branch name in both the backend and frontend repositories).
↑ Back to Top | → Current Implementation Status
- New backend plugin `redcap2` was added and registered.
- `redcap2` supports two export modes selectable in the UI:
  - Report mode (`exportMode: "report"`): exports a saved REDCap report by ID via `content=report`.
  - Records mode (`exportMode: "records"`): exports all project records via `content=record` with optional filters.
- `redcap2` supports a variable-list mode for the intermediate settings screen (`pluginOptions.request = "variables"`):
  - In report mode: fetches only the CSV header row of the report (header-only request, avoids full download).
  - In records mode: fetches the full field list from `content=metadata`.
- `redcap2` `Query()` and `Streams()` generate syncable virtual files directly from REDCap API exports (no file repository dependency).
- Frontend intermediate page (`/redcap2-export/:id`) with:
  - Report / All records toggle.
  - Report ID field (report mode only).
  - Common export controls: format, record type, CSV delimiter, raw/label, header labels.
  - Record-only filters: fields, forms, events, records, filter logic, date range (records mode only).
  - "Include survey fields" and "Include Data Access Groups" toggles (records mode only, default off).
  - Variable anonymization table with auto-detection of REDCap identifier-tagged fields.
- End-to-end `pluginOptions` payload propagated through options, compare, and stream/store requests.
- Export parameter routing is correct per mode:
  - `applySharedExportParams`: `type`, `csvDelimiter`, `rawOrLabel`, `rawOrLabelHeaders` (sent for both modes).
  - `applyRecordOnlyFilters`: `fields`, `forms`, `events`, `records`, `filterLogic`, `dateRangeBegin`, `dateRangeEnd`, `exportSurveyFields`, `exportDataAccessGroups` (sent for records mode only; these are not supported by `content=report`).
- Bundle cache keyed by `exportMode` + all stable options (including `exportSurveyFields`, `exportDataAccessGroups`); `generatedAt` excluded.
- REDCap built-in de-identification support:
  - `exportSurveyFields` and `exportDataAccessGroups` exposed as records-mode toggles (server-side suppression).
  - Identifier-tagged fields auto-detected from `content=metadata` (`identifier` column) and pre-selected as `blank` in the variable anonymization table; users can override to `none`.
- Existing `redcap` plugin remains available and unchanged for fallback.
Report mode (`exportMode: "report"`):
- `redcap/report-<id>/data.csv` or `data.json`
- `redcap/report-<id>/metadata.csv` (filtered to exported fields)
- `redcap/report-<id>/project_info.json`
- `redcap/report-<id>/events.csv` (longitudinal projects)
- `redcap/report-<id>/form_event_mapping.csv` (longitudinal projects)
- `redcap/report-<id>/manifest.json` (export config + timestamp + REDCap version + warnings)

Records mode (`exportMode: "records"`):
- `redcap/records/data.csv` or `data.json`
- `redcap/records/metadata.csv` (filtered to exported fields)
- `redcap/records/project_info.json`
- `redcap/records/events.csv` (longitudinal projects)
- `redcap/records/form_event_mapping.csv` (longitudinal projects)
- `redcap/records/manifest.json`

Not yet implemented:
- XML data export.
- Advanced de-identification modes beyond `blank` (drop/mask/pseudonymize/encrypt).
- DDI-CDI/Croissant/RO-Crate metadata exporters.
- Attachment/file-field download modes.
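The `manifest.json` listed among the generated files above is described as carrying the export config, a timestamp, the REDCap version, and warnings. A minimal sketch of how such a manifest could be modeled follows; the `Manifest` type, its JSON field names, and the sample version string are assumptions for illustration, not the plugin's actual schema.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Manifest is a hypothetical shape for manifest.json. The JSON field names
// are assumptions for illustration, not the plugin's actual schema.
type Manifest struct {
	ExportMode    string            `json:"exportMode"`
	Options       map[string]string `json:"options"`
	GeneratedAt   string            `json:"generatedAt"`
	REDCapVersion string            `json:"redcapVersion"`
	Warnings      []string          `json:"warnings"`
}

// newManifest fills in the timestamp and normalizes nil collections so the
// serialized JSON contains {} and [] instead of null.
func newManifest(mode, redcapVersion string, options map[string]string, warnings []string) Manifest {
	if options == nil {
		options = map[string]string{}
	}
	if warnings == nil {
		warnings = []string{}
	}
	return Manifest{
		ExportMode:    mode,
		Options:       options,
		GeneratedAt:   time.Now().UTC().Format(time.RFC3339),
		REDCapVersion: redcapVersion,
		Warnings:      warnings,
	}
}

func main() {
	// "14.0.0" is a placeholder version string.
	m := newManifest("report", "14.0.0", map[string]string{"reportId": "42"}, nil)
	out, err := json.MarshalIndent(m, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because `generatedAt` changes on every run, it would have to stay out of any cache key or change-detection hash, which matches the bundle cache rule stated earlier.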
↑ Back to Top | → Export Mode Design
Both modes are now implemented as equal peers in the same UI and backend.

Report mode:
- Exports a saved REDCap report by ID via `content=report`.
- The report definition in the REDCap UI controls which fields, records, and filters are included; the plugin sends no extra filter parameters.
- User enters the report ID manually (the standard REDCap API has no endpoint to list reports; IDs are visible in "My Reports & Exports" in the REDCap web UI).
- Variable list for anonymization is fetched by a CSV header-only request against the report endpoint (avoids downloading full data just to get field names). Falls back to `content=metadata` if that fails.
- Best choice when: the user has already curated a report in REDCap and wants to export exactly that snapshot.

Records mode:
- Exports directly via `content=record` with optional server-side filters.
- No report ID needed: works on any project without prior report setup.
- Supports these REDCap record-export filter parameters: `fields`, `forms`, `events`, `records`, `filterLogic`, `dateRangeBegin`, `dateRangeEnd`.
- Variable list for anonymization is fetched from `content=metadata` (all project fields).
- Best choice when: the user wants an ad-hoc export with dynamic filters, or no report has been configured.
The content=report endpoint does not accept record-filter parameters.
The split into applySharedExportParams and applyRecordOnlyFilters enforces this:
| Parameter | Report mode | Records mode |
|---|---|---|
| `type` (`flat`/`eav`) | ✓ | ✓ |
| `csvDelimiter` | ✓ | ✓ |
| `rawOrLabel` | ✓ | ✓ |
| `rawOrLabelHeaders` | ✓ | ✓ |
| `fields` | not sent | ✓ |
| `forms` | not sent | ✓ |
| `events` | not sent | ✓ |
| `records` | not sent | ✓ |
| `filterLogic` | not sent | ✓ |
| `dateRangeBegin` / `dateRangeEnd` | not sent | ✓ |
| `exportSurveyFields` | not sent | ✓ |
| `exportDataAccessGroups` | not sent | ✓ |
| `report_id` | required | not sent |
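The routing in the table above can be sketched in Go as follows. The function names `applySharedExportParams` and `applyRecordOnlyFilters` come from this document, but their signatures, the `exportOptions` type, the `buildExportForm` helper, and the comma-joined `fields` encoding are assumptions made for illustration. Only a subset of the record-only filters is shown.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// exportOptions is a hypothetical, trimmed-down view of the plugin options.
type exportOptions struct {
	ExportMode             string // "report" or "records"
	ReportID               string
	RecordType             string // "flat" or "eav"
	CSVDelimiter           string
	RawOrLabel             string
	RawOrLabelHeaders      string
	Fields                 []string
	FilterLogic            string
	ExportSurveyFields     bool
	ExportDataAccessGroups bool
}

// setIfPresent adds key=value only when value is non-empty.
func setIfPresent(v url.Values, key, value string) {
	if value != "" {
		v.Set(key, value)
	}
}

// applySharedExportParams sets the parameters accepted in both modes.
func applySharedExportParams(v url.Values, o exportOptions) {
	setIfPresent(v, "type", o.RecordType)
	setIfPresent(v, "csvDelimiter", o.CSVDelimiter)
	setIfPresent(v, "rawOrLabel", o.RawOrLabel)
	setIfPresent(v, "rawOrLabelHeaders", o.RawOrLabelHeaders)
}

// applyRecordOnlyFilters sets parameters that only content=record accepts.
// Joining field names with commas is an assumption about the wire format.
func applyRecordOnlyFilters(v url.Values, o exportOptions) {
	if len(o.Fields) > 0 {
		v.Set("fields", strings.Join(o.Fields, ","))
	}
	setIfPresent(v, "filterLogic", o.FilterLogic)
	if o.ExportSurveyFields {
		v.Set("exportSurveyFields", "true")
	}
	if o.ExportDataAccessGroups {
		v.Set("exportDataAccessGroups", "true")
	}
}

// buildExportForm routes options to the right parameter set per mode.
func buildExportForm(o exportOptions) (url.Values, error) {
	v := url.Values{}
	switch o.ExportMode {
	case "report":
		if o.ReportID == "" {
			return nil, fmt.Errorf("report mode requires a report ID")
		}
		v.Set("content", "report")
		v.Set("report_id", o.ReportID)
	case "records":
		v.Set("content", "record")
		applyRecordOnlyFilters(v, o)
	default:
		return nil, fmt.Errorf("unknown export mode %q", o.ExportMode)
	}
	applySharedExportParams(v, o)
	return v, nil
}

func main() {
	form, err := buildExportForm(exportOptions{
		ExportMode: "records",
		RecordType: "flat",
		Fields:     []string{"record_id", "age"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(form.Encode())
}
```

Keeping the record-only filters in their own function means a report-mode request cannot pick them up by accident, even when the options struct still holds values from an earlier records-mode session.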
↑ Back to Top | → Target User Flow
Report mode:
- User selects the `REDCap Reports (beta)` source plugin.
- User enters REDCap URL and API token.
- On the intermediate export settings page:
  - Select Report mode (default).
  - Enter Report ID (find it in REDCap under "My Reports & Exports").
  - Choose format (`csv`/`json`), record type, delimiter, raw/label options.
  - Optionally configure per-variable anonymization (`none`/`blank`).
- Compare step shows generated virtual files under `redcap/report-<id>/`.
- User selects files and syncs to Dataverse.

Records mode:
- User selects the `REDCap Reports (beta)` source plugin.
- User enters REDCap URL and API token.
- On the intermediate export settings page:
  - Select All records mode.
  - Choose format, record type, delimiter, raw/label options.
  - Optionally set fields, forms, events, records, filter logic, date range.
  - Optionally configure per-variable anonymization.
- Compare step shows generated virtual files under `redcap/records/`.
- User selects files and syncs to Dataverse.
↑ Back to Top | → Syncable File Model
redcap2 exposes generated virtual files through Query() and Streams().
File paths per mode:
Report mode:
- `redcap/report-<id>/data.csv` or `data.json`
- `redcap/report-<id>/metadata.csv`
- `redcap/report-<id>/project_info.json`
- `redcap/report-<id>/events.csv` (longitudinal only)
- `redcap/report-<id>/form_event_mapping.csv` (longitudinal only)
- `redcap/report-<id>/manifest.json`

Records mode:
- `redcap/records/data.csv` or `data.json`
- `redcap/records/metadata.csv`
- `redcap/records/project_info.json`
- `redcap/records/events.csv` (longitudinal only)
- `redcap/records/form_event_mapping.csv` (longitudinal only)
- `redcap/records/manifest.json`
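The path layout above can be derived mechanically from the mode, the report ID, the data format, and whether the project is longitudinal. A minimal sketch, with hypothetical helper names `bundleDir` and `bundleFiles`:

```go
package main

import (
	"fmt"
	"path"
)

// bundleDir returns the virtual directory for a mode, following the layout
// listed above.
func bundleDir(mode, reportID string) (string, error) {
	switch mode {
	case "report":
		if reportID == "" {
			return "", fmt.Errorf("report mode requires a report ID")
		}
		return path.Join("redcap", "report-"+reportID), nil
	case "records":
		return path.Join("redcap", "records"), nil
	default:
		return "", fmt.Errorf("unknown export mode %q", mode)
	}
}

// bundleFiles lists the virtual file paths for one export. The event files
// are included only for longitudinal projects.
func bundleFiles(mode, reportID, dataFormat string, longitudinal bool) ([]string, error) {
	dir, err := bundleDir(mode, reportID)
	if err != nil {
		return nil, err
	}
	if dataFormat != "csv" && dataFormat != "json" {
		return nil, fmt.Errorf("unsupported data format %q", dataFormat)
	}
	names := []string{"data." + dataFormat, "metadata.csv", "project_info.json"}
	if longitudinal {
		names = append(names, "events.csv", "form_event_mapping.csv")
	}
	names = append(names, "manifest.json")

	paths := make([]string, 0, len(names))
	for _, name := range names {
		paths = append(paths, path.Join(dir, name))
	}
	return paths, nil
}

func main() {
	files, err := bundleFiles("report", "42", "csv", true)
	if err != nil {
		panic(err)
	}
	for _, f := range files {
		fmt.Println(f)
	}
}
```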
Planned naming extensions (later):
- Additional metadata sidecars for standards exporters (DDI-CDI, Croissant, RO-Crate).
Design requirements:
- Deterministic path/ID based on mode + options.
- Stable hashing for change detection.
- Each generated file can be independently selected in the tree.
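One way to meet the first two requirements is to hash the mode and all stable options in sorted key order while skipping volatile values such as `generatedAt`. The sketch below is an illustration under that assumption; `stableKey` is a hypothetical helper, not the plugin's `bundleCacheKey`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// stableKey hashes the export mode plus all options in sorted key order.
// The volatile generatedAt entry is skipped so that two runs with the same
// settings produce the same key.
func stableKey(mode string, options map[string]string) string {
	keys := make([]string, 0, len(options))
	for k := range options {
		if k == "generatedAt" {
			continue
		}
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	// %q quoting keeps keys and values unambiguous even if they contain
	// separators such as '=' or newlines.
	fmt.Fprintf(&b, "mode=%q\n", mode)
	for _, k := range keys {
		fmt.Fprintf(&b, "%q=%q\n", k, options[k])
	}
	sum := sha256.Sum256([]byte(b.String()))
	return hex.EncodeToString(sum[:])
}

func main() {
	key := stableKey("records", map[string]string{
		"fields":             "record_id,age",
		"exportSurveyFields": "false",
		"generatedAt":        "2026-03-10T10:00:00Z",
	})
	fmt.Println(key)
}
```

Sorting the keys matters because Go map iteration order is randomized; without it the same options could hash differently between runs.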
↑ Back to Top | → Export Controls
Common (both modes):
- `exportMode`: `report` or `records`
- `dataFormat`: `csv` or `json`
- `recordType`: `flat` or `eav`
- `csvDelimiter`: comma or tab
- `rawOrLabel`: `raw`, `label`, or `both`
- `rawOrLabelHeaders`: `raw` or `label`
- `variables[]` with anonymization mode: `none` or `blank`

Report mode only:
- `reportId` (required; entered manually, because the REDCap API has no report-listing endpoint)

Records mode only:
- `fields`
- `forms`
- `events`
- `records`
- `filterLogic`
- `dateRangeBegin`
- `dateRangeEnd`
- `exportSurveyFields`: include survey identifier and timestamp fields (default `false`)
- `exportDataAccessGroups`: include Data Access Group field (default `false`)

Planned (not yet implemented):
- XML output support
- `include_attachments`: default `false`
- `attachments_mode`: `reference-only` or `download`
- `attachments_max_size_mb`

Rationale:
- For many projects, upload/file fields should remain references in MVP.
- Full attachment download can be expensive and should be explicit.
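The implemented export controls listed above have small, closed value sets, so they can be validated before any request is sent to REDCap. A minimal sketch, assuming a hypothetical `exportControls` struct; the allowed values are the ones stated in this section.

```go
package main

import "fmt"

// exportControls is a hypothetical struct holding the common controls plus
// the report ID.
type exportControls struct {
	ExportMode        string
	ReportID          string
	DataFormat        string
	RecordType        string
	RawOrLabel        string
	RawOrLabelHeaders string
}

// oneOf reports an error when value is not in the allowed set.
func oneOf(name, value string, allowed ...string) error {
	for _, a := range allowed {
		if value == a {
			return nil
		}
	}
	return fmt.Errorf("%s: %q is not one of %v", name, value, allowed)
}

// validate returns the first violated rule, or nil when all controls are valid.
func (c exportControls) validate() error {
	checks := []error{
		oneOf("exportMode", c.ExportMode, "report", "records"),
		oneOf("dataFormat", c.DataFormat, "csv", "json"),
		oneOf("recordType", c.RecordType, "flat", "eav"),
		oneOf("rawOrLabel", c.RawOrLabel, "raw", "label", "both"),
		oneOf("rawOrLabelHeaders", c.RawOrLabelHeaders, "raw", "label"),
	}
	for _, err := range checks {
		if err != nil {
			return err
		}
	}
	if c.ExportMode == "report" && c.ReportID == "" {
		return fmt.Errorf("reportId: required in report mode")
	}
	return nil
}

func main() {
	c := exportControls{
		ExportMode:        "report",
		ReportID:          "42",
		DataFormat:        "csv",
		RecordType:        "flat",
		RawOrLabel:        "raw",
		RawOrLabelHeaders: "raw",
	}
	fmt.Println(c.validate())
}
```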
↑ Back to Top | → REDCap Built-In De-Identification Parameters
The REDCap record-export API (content=record) natively supports several de-identification parameters that can be applied server-side before data leaves REDCap. This section analyzes these parameters and how they relate to the manual per-variable anonymization we currently implement client-side.
The content=record endpoint accepts these de-identification-related parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `exportSurveyFields` | boolean | `false` | Include survey-specific fields (`redcap_survey_identifier`, `[instrument]_timestamp`). Set to `false` to strip them. |
| `exportDataAccessGroups` | boolean | `false` | Include the `redcap_data_access_group` field. Set to `false` to strip it. |
| `exportCheckboxLabel` | boolean | `false` | Export checkbox labels instead of raw values (relevant for label-based anonymization). |
| `filterLogic` | string | none | Server-side record filtering. Already implemented. Can exclude records containing sensitive values. |
Additionally, REDCap Data Dictionaries allow project admins to tag fields with Identifier status (identifier = y). While this designation is visible in the content=metadata export (the identifier column), there is no API parameter to automatically strip all identifier-tagged fields from a record export. That logic must be implemented client-side by the exporting tool.
The content=report endpoint does not accept any of the de-identification parameters above. Reports are exported as configured in the REDCap UI. However, when creating or editing a report in the REDCap web interface, the user can choose:
- "Remove all tagged Identifier fields": the report definition itself excludes fields marked as identifiers.
- "Hash the Record ID": the report replaces the record ID with a hashed value.
- "Remove all free-text fields": strips notes/text fields.
- Date removal or date shifting: removes date fields or shifts them.
These options are set in the REDCap UI when creating the report and take effect before the API returns data. They are not settable via the API at export time.
| Capability | REDCap built-in (server-side) | Our current approach (client-side) |
|---|---|---|
| Suppress survey identifier fields | `exportSurveyFields=false` (records mode) | Implemented: toggle on settings page (records mode) |
| Suppress Data Access Groups | `exportDataAccessGroups=false` (records mode) | Implemented: toggle on settings page (records mode) |
| Strip identifier-tagged fields | Not available as API parameter; only in report definitions | Implemented: auto-detected from metadata, pre-selected as `blank` |
| Hash record ID | Report-level setting in REDCap UI only | Not yet implemented |
| Blank/drop arbitrary fields | Not available | `variables[].anonymization = "blank"` per field |
| Remove free-text fields | Report-level setting in REDCap UI only | Not available (would need new mode) |
| Date-shift dates | Report-level setting in REDCap UI only | Not available |
| Exclude specific fields | `fields` param (records mode, positive filter) | `variables[].anonymization = "blank"` per field |
| Server-side record filter | `filterLogic` (records mode) | Already implemented |
- Expose `exportSurveyFields` and `exportDataAccessGroups` as toggles in records mode. Done: implemented as "Include survey fields" and "Include Data Access Groups" checkboxes on the records-mode settings page. Default is `false` (off). Backend sends the parameters in `applyRecordOnlyFilters` only when the user opts in.
- Auto-detect identifier-tagged fields from metadata. Done: backend parses the `identifier` column from the `content=metadata` CSV and returns `Selected: true` on those fields in the variable-list response. Frontend pre-selects those variables as `blank` in the anonymization table. Users can override to `none`.
- For report mode, document that de-identification is best done in the REDCap report definition itself. Since the report API has no de-identification parameters, users should be advised to enable "Remove all tagged Identifier fields", "Hash the Record ID", etc. when creating the report in REDCap. The manifest should record whether the report was configured for de-identification (this info is not available from the API, so it should be a user attestation or checkbox in the UI).
- Do not try to replicate date-shifting or record ID hashing client-side in the near term. REDCap's date-shifting uses project-level offsets that are not exposed via the API. Reimplementing this would be complex and fragile. If date-shifting is needed, users should use a report with date-shifting enabled, or use records mode and apply a post-processing step.
- Keep the manual per-variable `blank` mode as the primary client-side tool for both modes. It is more flexible than anything REDCap offers at the API level and complements the built-in parameters well. The planned `drop`/`mask`/`pseudonymize` extensions remain valuable for cases that built-in parameters cannot cover.

`exportSurveyFields` and `exportDataAccessGroups` backend wiring:
- Two fields added to `pluginOptions`: `ExportSurveyFields bool` and `ExportDataAccessGroups bool`.
- Sent in `applyRecordOnlyFilters` (records-mode only) when the user opts in.
- Included in `bundleCacheKey` for correct cache separation.
- Two checkboxes on the frontend settings page (records mode only, defaults off).

Identifier auto-detection wiring:
- `identifierFieldsFromMetadata()` parses the `identifier` column from the `content=metadata` CSV.
- `listVariablesFromMetadata()` and `listVariablesFromReport()` return `SelectItem` entries with `Selected: true` for identifier-tagged fields.
- Frontend reads the `selected` flag and pre-sets those variables' anonymization to `blank` (user can override to `none`).

Both changes are backward-compatible with the existing payload structure.
↑ Back to Top | → De-Identification And Encryption
De-identification should be policy-driven, not ad-hoc. The built-in REDCap parameters described above should be used as the first layer (server-side stripping), with our policy model applied as a second layer (client-side transforms).
Suggested policy file (redcap2-policy.json):
- `drop_fields`: remove columns entirely
- `blank_fields`: keep column but replace all values with empty values
- `mask_rules`: regex or function-based transforms
- `pseudonymize_fields`: deterministic irreversible tokenization
- `encrypt_fields`: reversible encryption
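To make the difference between `drop_fields` and `blank_fields` concrete, the sketch below parses a policy document and applies those two rules to tabular data. The `policy` struct and `applyPolicy` function are hypothetical; `mask_rules`, `pseudonymize_fields`, and `encrypt_fields` are deliberately left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// policy mirrors two keys of the suggested redcap2-policy.json.
type policy struct {
	DropFields  []string `json:"drop_fields"`
	BlankFields []string `json:"blank_fields"`
}

// toSet turns a list of names into a lookup set.
func toSet(names []string) map[string]bool {
	set := make(map[string]bool, len(names))
	for _, n := range names {
		set[n] = true
	}
	return set
}

// applyPolicy removes dropped columns entirely and empties the values of
// blanked columns while keeping their header, so the schema is preserved.
func applyPolicy(header []string, rows [][]string, p policy) ([]string, [][]string) {
	drop, blank := toSet(p.DropFields), toSet(p.BlankFields)

	var keep []int
	var outHeader []string
	for i, name := range header {
		if !drop[name] {
			keep = append(keep, i)
			outHeader = append(outHeader, name)
		}
	}

	outRows := make([][]string, 0, len(rows))
	for _, row := range rows {
		out := make([]string, 0, len(keep))
		for _, i := range keep {
			value := ""
			if i < len(row) && !blank[header[i]] {
				value = row[i]
			}
			out = append(out, value)
		}
		outRows = append(outRows, out)
	}
	return outHeader, outRows
}

func main() {
	raw := `{"drop_fields": ["first_name"], "blank_fields": ["dob"]}`
	var p policy
	if err := json.Unmarshal([]byte(raw), &p); err != nil {
		panic(err)
	}
	header, rows := applyPolicy(
		[]string{"record_id", "first_name", "dob"},
		[][]string{{"1", "Ann", "1990-01-01"}},
		p,
	)
	fmt.Println(header, rows)
}
```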
Available techniques, in order of preference:
- Server-side suppression (new, via built-in REDCap parameters)
  - `exportSurveyFields=false`: suppress survey identifier and timestamp fields
  - `exportDataAccessGroups=false`: suppress data access group field
  - safest option: the data never leaves REDCap
- Drop
  - safest client-side option for direct identifiers
- Blank
  - preserves schema, no values
  - can be auto-applied to REDCap identifier-tagged fields
- Deterministic pseudonymization (non-reversible)
  - e.g. HMAC-based token with secret key
  - consistent per value, not reversible
- Reversible encryption
  - only if strictly required
  - requires key management, key rotation, audit policy, and strict access controls
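The HMAC-based token mentioned under deterministic pseudonymization could look like the sketch below. It uses HMAC-SHA256 from the Go standard library; the `pseudonymize` function and the choice to mix the field name into the MAC are assumptions for illustration, and key handling is out of scope here.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// pseudonymize returns a deterministic token for value. The same key, field,
// and value always give the same token, and the token cannot be reversed
// without guessing the input. Including the field name keeps equal values in
// different columns from sharing a token (a design assumption).
func pseudonymize(key []byte, field, value string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(field))
	mac.Write([]byte{0}) // separator between field name and value
	mac.Write([]byte(value))
	return hex.EncodeToString(mac.Sum(nil))
}

func main() {
	// The key is a demo value; a real key must come from a secret store and
	// must never be logged or written into job payloads.
	key := []byte("demo-key-do-not-use")
	fmt.Println(pseudonymize(key, "first_name", "Ann"))
	fmt.Println(pseudonymize(key, "first_name", "Ann")) // same token again
}
```

Anyone holding the key can recompute tokens for guessed values, so the key needs the same protection as the identifiers themselves.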
Important:
- "Anonymized and reversible" is not anonymous in the strict privacy sense.
- If reversibility is needed, call it pseudonymization/encryption and treat it as sensitive.

Recommended defaults:
- Use server-side suppression (`exportSurveyFields=false`, `exportDataAccessGroups=false`) as the baseline.
- Auto-blank REDCap identifier-tagged fields (detected from metadata) by default; allow user override.
- Default to `blank` or `drop` for any remaining known identifiers.
- Make reversible encryption opt-in and disabled by default.
- Store no raw keys in job payloads or logs.
↑ Back to Top | → Metadata Outputs
Requested targets:
- DDI-CDI
- Croissant (including CDIF profile compatibility target)
- RO-Crate
Use one internal normalized metadata model, then fan out to exporters.
Normalized model should include:
- project-level metadata
- table/file-level metadata
- variable-level metadata
- code lists/value labels
- provenance (source report/mode, options, timestamp)
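A rough sketch of such a normalized model, together with a fan-out over exporters, is shown below. All type names, field names, and the `Exporter` interface are hypothetical. The only exporter included is a plain JSON dump that stands in for real DDI-CDI, Croissant, or RO-Crate generators, which are not implemented here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Variable holds variable-level metadata, including an optional code list.
type Variable struct {
	Name  string            `json:"name"`
	Label string            `json:"label,omitempty"`
	Type  string            `json:"type,omitempty"`
	Codes map[string]string `json:"codes,omitempty"` // raw value -> label
}

// Table holds table/file-level metadata.
type Table struct {
	Path      string     `json:"path"`
	Variables []Variable `json:"variables"`
}

// Provenance records where the data came from and how it was exported.
type Provenance struct {
	ExportMode  string            `json:"exportMode"`
	ReportID    string            `json:"reportId,omitempty"`
	Options     map[string]string `json:"options,omitempty"`
	GeneratedAt string            `json:"generatedAt"`
}

// Model is the single normalized structure every exporter reads from.
type Model struct {
	ProjectTitle string     `json:"projectTitle"`
	Tables       []Table    `json:"tables"`
	Provenance   Provenance `json:"provenance"`
}

// Exporter turns the normalized model into one output file.
type Exporter interface {
	FileName() string
	Export(m Model) ([]byte, error)
}

// plainJSONExporter is a stand-in that only dumps the model as JSON.
type plainJSONExporter struct{}

func (plainJSONExporter) FileName() string { return "model.json" }

func (plainJSONExporter) Export(m Model) ([]byte, error) {
	return json.MarshalIndent(m, "", "  ")
}

// fanOut runs every exporter against the same model snapshot.
func fanOut(m Model, exporters []Exporter) (map[string][]byte, error) {
	out := make(map[string][]byte, len(exporters))
	for _, e := range exporters {
		data, err := e.Export(m)
		if err != nil {
			return nil, fmt.Errorf("%s: %w", e.FileName(), err)
		}
		out[e.FileName()] = data
	}
	return out, nil
}

func main() {
	m := Model{
		ProjectTitle: "Demo project",
		Tables: []Table{{
			Path:      "redcap/records/data.csv",
			Variables: []Variable{{Name: "sex", Codes: map[string]string{"1": "Female", "2": "Male"}}},
		}},
		Provenance: Provenance{ExportMode: "records", GeneratedAt: "2026-03-10T10:00:00Z"},
	}
	files, err := fanOut(m, []Exporter{plainJSONExporter{}})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(files["model.json"]))
}
```

Because every exporter receives the same `Model` value, all metadata outputs describe the same source snapshot.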
Then:
- emit `*.jsonld` for DDI-CDI
- emit `croissant.json` (or JSON-LD form as needed by tooling)
- emit `ro-crate-metadata.json`

Option A:
- Generate CSV + metadata sidecars in `redcap2`
- Use existing DDI-CDI generation pipeline on resulting tabular files

Option B:
- Add a direct REDCap->DDI-CDI generator path
- Reuse helper code from existing `ddi-cdi` components where practical

MVP recommendation: Option A.
↑ Back to Top | → Architecture In rdm-integration
Implemented backend and configuration changes:
- `image/app/plugin/impl/redcap2/common.go`
- `image/app/plugin/impl/redcap2/options.go`
- `image/app/plugin/impl/redcap2/query.go`
- `image/app/plugin/impl/redcap2/streams.go`
- `image/app/plugin/registry.go` with `redcap2`
- `image/app/frontend/default_frontend_config.json`: add `redcap2` entry
- `conf/frontend_config.json`: add `redcap2` entry
- plugin request structs now include `pluginOptions`:
  - `OptionsRequest`
  - `CompareRequest`
  - `StreamParams`

Planned backend extensions (not yet implemented):
- `image/app/plugin/impl/redcap2/metadata.go`
- `image/app/plugin/impl/redcap2/deidentify.go`
- `image/app/plugin/impl/redcap2/exporters/` (DDI-CDI/Croissant/RO-Crate)

Implemented frontend `redcap2` UX:
- Intermediate export settings page (`/redcap2-export/:id`).
- Report / All records mode toggle (report mode is default).
- Report ID field (visible in report mode only).
- Common export controls: format, record type, delimiter, raw/label, header labels.
- Record-only filter fields: fields, forms, events, records, filter logic, date range (visible in records mode only).
- "Include survey fields" and "Include Data Access Groups" toggles (records mode only, default off).
- Variable anonymization table (`none`/`blank`) with auto-detection of REDCap identifier-tagged fields (pre-selected as `blank`).
- Generated files preview updates to show mode-appropriate paths.
- `pluginOptions` payload propagated through options, compare, and stream/store requests.

Planned frontend extensions:
- Richer de-identification config panel.
- Metadata format toggles and generators.
Constraint:
Current generic request model is string-heavy (option, repoName, etc.). pluginOptions is now used for structured redcap2 settings and should remain the extension mechanism.
↑ Back to Top | → Step-By-Step Implementation Plan
- Confirm whether report listing endpoint exists on target REDCap instance. Confirmed: standard REDCap API has no report-listing endpoint; report ID is entered manually.
- Confirm minimum REDCap version and API rights assumptions.
- Lock MVP scope:
  - report mode + records mode (both implemented)
  - csv/json initial scope
  - report-sidecar generation
  - no attachment download
  - no reversible encryption in MVP
- Scaffold `redcap2` plugin package.
- Implement API client helpers for report export + metadata export.
- Implement `Query()` to create virtual nodes for generated files.
- Implement `Streams()` to generate bytes on demand.
- Implement deterministic hashes in `Query()` for generated files.
- Add logging, error handling, and timeout strategy for long exports.
- Register plugin in `registry.go`.
- Add `redcap2` entry to frontend config.
- Add required fields and intermediate settings page:
  - URL
  - token
  - report ID (text input on export page)
  - export controls (including `rawOrLabel`, `rawOrLabelHeaders`)
  - variable anonymization
- Pass settings into compare/stream requests.
- Verify compare tree and sync workflow end-to-end.
- Done: add record-mode API path (`content=record`).
- Done: add fields/forms/events/records/filter/date-range options.
- Done: add flat/eav export mode.
- Done: separate `applySharedExportParams` from `applyRecordOnlyFilters`.
- Done: add report/records mode toggle to frontend.
- Done: expose `exportSurveyFields` and `exportDataAccessGroups` as records-mode toggles.
- Done: auto-detect identifier-tagged fields from metadata and pre-blank them.
- Add unit tests for each parameter combination.
- Add policy schema and validation.
- Implement field-level transforms (drop/blank/mask/pseudonymize).
- Add optional reversible encryption with key-provider abstraction.
- Add audit/provenance output listing transformed fields and method.
- Add strict safeguards:
  - no key logging
  - no raw-value logging
  - secure defaults
- Define normalized metadata model.
- Implement exporter adapters:
  - DDI-CDI
  - Croissant
  - RO-Crate
- Expose format toggles in UI.
- Add schema validation tests for each output type.
- Performance test with large REDCap projects.
- Security review (keys, logs, PII handling, transport).
- Add operator documentation and troubleshooting.
- Run pilot with limited users.
- Keep `redcap` plugin as stable fallback until `redcap2` is proven.
↑ Back to Top | → Testing Plan
- Request payload construction for each mode.
- Parsing of report/record responses.
- Virtual node generation and deterministic IDs.
- Hash determinism for unchanged payloads.
- De-identification policy behavior.
- Exporter output shape checks.
- `compare -> sync` with report mode.
- `compare -> sync` with record mode and filters.
- Dataverse ingest compatibility for generated data files.
- DDI-CDI/Croissant/RO-Crate generation from same source snapshot.
- Ensure keys are never logged.
- Verify reversible encryption requires explicit opt-in.
- Validate redaction of error messages containing sensitive values.
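As an illustration of the redaction check in the last item, the sketch below scrubs known secrets (for example the API token) from an error message before it is logged. The `redact` helper is hypothetical; it covers the literal and the URL-encoded form of each secret, and it does not attempt to detect unknown sensitive values.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// redact replaces every occurrence of the given secrets in msg with a fixed
// marker. Both the literal and the query-escaped form are replaced, since a
// token may appear inside an encoded form body or URL.
func redact(msg string, secrets ...string) string {
	for _, s := range secrets {
		if s == "" {
			continue // an empty secret would match everywhere
		}
		msg = strings.ReplaceAll(msg, s, "[REDACTED]")
		msg = strings.ReplaceAll(msg, url.QueryEscape(s), "[REDACTED]")
	}
	return msg
}

func main() {
	token := "ABC123SECRET"
	err := fmt.Errorf("POST failed: body was token=%s&content=record", token)
	fmt.Println(redact(err.Error(), token))
}
```

A unit test for this behavior would assert that the logged string no longer contains the token in either form.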
↑ Back to Top | → Open Questions
- Do all target REDCap instances expose report listing? Resolved: The standard REDCap API does not expose a report-listing endpoint. Report IDs are entered manually.
- Should record mode be a separate flow or a toggle? Resolved: Implemented as a toggle on the same settings page.
- Which de-identification policy should be default at KU Leuven:
  - drop identifiers
  - blank identifiers
  - deterministic pseudonymization
- Are reversible transformations acceptable under institutional policy?
- Should metadata outputs be generated during sync, after sync, or both?
- Should attachments be supported in MVP or deferred?

References:
- REDCap Reports API (report export by report ID):
- REDCap Records API (field/form/event/filter/date params):
- REDCap Metadata API:
- REDCap Instruments API:
- REDCap Events API:
- REDCap File Repository semantics (`folder_id` vs `doc_id`):
- REDCap report export workflow reference: