As part of the OpenDP project, we're adding an integration with the Dataverse Software. Generally, the idea is that a data depositor or curator would use the OpenDP Tool to create a DP metadata release for a restricted file, and then deposit that release back to a Dataverse installation as an auxiliary file using the APIs added in 5.3. The DP release would then be accessible for download or use by external tools as appropriate (#7400). A researcher could then use the DP metadata release to decide whether it's worthwhile to request access to the file.
The big issue with this workflow is that we already expose Summary Stats in the DDI Export. If the non-DP version of the Summary Stats remains available, people wouldn't bother creating the DP release, and even if they did, the privacy protection would be undermined anyway. Not great!
We previously discussed this in #6474 and added some documentation to clarify the current behavior, and we mentioned a few times that we'd have to revisit this availability once we started to support sensitive data. Looks like it's time to revisit.
I'd welcome feedback on the best approach for this, but generally I'm thinking we:
- Remove the export options that expose summary stats and variable names/labels for restricted files (UI and API) for unauthorized users
- Make these summary stats and variable names/labels unavailable through other APIs for unauthorized users
- Add a release note, with a note to re-export all datasets (reExportAll) so that previously cached exports no longer contain these stats
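
To make the proposed rule concrete, here's a rough sketch in Python. This is not Dataverse code: `DataFile`, `exportable_variables`, and the `user_can_download` flag are hypothetical names made up for illustration, and the real permission check would be whatever the application already uses for restricted file access.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    """Hypothetical stand-in for a tabular file and its variable metadata."""
    file_id: int
    restricted: bool
    # Maps variable name -> {"label": ..., "summary_stats": {...}}
    variables: dict = field(default_factory=dict)


def exportable_variables(data_file: DataFile, user_can_download: bool) -> dict:
    """Return the variable-level metadata that may be shown to the caller.

    Assumed rule: for a restricted file, an unauthorized caller gets no
    variable names, labels, or summary stats at all. Everyone else gets
    the metadata unchanged.
    """
    if data_file.restricted and not user_can_download:
        return {}
    return data_file.variables


# Example: the same restricted file seen by two different callers.
survey = DataFile(
    file_id=1,
    restricted=True,
    variables={"income": {"label": "Household income", "summary_stats": {"mean": 52000.0}}},
)
hidden = exportable_variables(survey, user_can_download=False)
visible = exportable_variables(survey, user_can_download=True)
```

The same check would need to sit behind both the export path and any other API that returns variable metadata, otherwise the stats would still leak through whichever path was missed.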