diff --git a/doc/release-notes/11639-db-opts-idempotency.md b/doc/release-notes/11639-db-opts-idempotency.md
new file mode 100644
index 00000000000..f73cbdebf83
--- /dev/null
+++ b/doc/release-notes/11639-db-opts-idempotency.md
@@ -0,0 +1,45 @@
+## Database Settings Cleanup
+
+With this release, we remove some legacy quirks around Database Settings and provide better Admin API endpoints for them.
+
+The most important changes:
+
+1. Setting `BuiltinUsers.KEY` was renamed to `:BuiltinUsersKey`, aligned with our general naming pattern for options.
+2. Setting `WorkflowsAdmin#IP_WHITELIST_KEY` was renamed to `:WorkflowsAdminIpWhitelist`, aligned with our general naming pattern for options.
+3. Setting `:TabularIngestSizeLimit` no longer uses suffixes for formats and becomes a JSON-based setting instead.
+4. If set, all three settings will be migrated to their new form automatically for you (Flyway migration).
+5. You can no longer (accidentally) create or use arbitrary setting names or languages. All Admin API endpoints for settings now validate setting names and languages for existence and compliance.
+
+As an administrator of a Dataverse instance, you can now make use of enhanced Bulk Operations on the Settings Admin API:
+
+1. Retrieving all settings as JSON via `GET /api/admin/settings` now supports localized options, too.
+2. You can replace all existing settings in an idempotent way by sending JSON to `PUT /api/admin/settings`. This will create, update, and remove settings as necessary in one atomic operation. The new endpoint is especially useful to admins using GitOps or other automations. It allows control over all Database Settings from a single source without risking an undefined state.
+
+Note: Despite the validation of setting names and languages, the content of any database setting is still not validated when using the Settings Admin API!
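The "create, update, and remove as necessary" semantics of the idempotent `PUT /api/admin/settings` endpoint can be sketched as follows. This is an illustrative shell model of the reconciliation only, not Dataverse source code, and the setting names are made up:

```shell
#!/usr/bin/env bash
# Illustrative model of the idempotent replace semantics (NOT Dataverse code):
# after the PUT, the stored settings equal exactly what was sent -- anything
# absent from the payload is removed. Setting names here are invented.
declare -A current=( [":InstallationName"]="Old" [":StaleSetting"]="x" )
declare -A desired=( [":InstallationName"]="LibreScholar" [":NewSetting"]="y" )

plan() {
  local key
  for key in "${!desired[@]}"; do
    if [ -z "${current[$key]+set}" ]; then
      echo "create $key"     # not stored yet
    elif [ "${current[$key]}" != "${desired[$key]}" ]; then
      echo "update $key"     # stored with a different value
    fi
  done
  for key in "${!current[@]}"; do
    if [ -z "${desired[$key]+set}" ]; then
      echo "remove $key"     # absent from the payload
    fi
  done
}

plan | sort
```

Because the whole reconciliation happens in one atomic request on the server, automations can simply re-send the desired state without computing a plan themselves.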
+
+### Updated Database Settings
+
+The following database settings were added to the official list within the code (to remain valid with the settings cleanup mentioned above):
+
+- `:BagGeneratorThreads`
+- `:BagItHandlerEnabled`
+- `:BagItLocalPath`
+- `:BagValidatorJobPoolSize`
+- `:BagValidatorJobWaitInterval`
+- `:BagValidatorMaxErrors`
+- `:BuiltinUsersKey` - formerly `BuiltinUsers.KEY`
+- `:CreateDataFilesMaxErrorsToDisplay`
+- `:DRSArchiverConfig` - a Harvard-specific setting
+- `:DuraCloudContext`
+- `:DuraCloudHost`
+- `:DuraCloudPort`
+- `:FileCategories`
+- `:GoogleCloudBucket`
+- `:GoogleCloudProject`
+- `:LDNAnnounceRequiredFields`
+- `:LDNTarget`
+- `:WorkflowsAdminIpWhitelist` - formerly `WorkflowsAdmin#IP_WHITELIST_KEY`
diff --git a/doc/release-notes/11744-cors-echo-origin-vary.md b/doc/release-notes/11744-cors-echo-origin-vary.md
new file mode 100644
index 00000000000..48eaa3b96f9
--- /dev/null
+++ b/doc/release-notes/11744-cors-echo-origin-vary.md
@@ -0,0 +1,41 @@
+# 11744: CORS handling improvements
+
+Modernizes CORS so browser integrations (previewers, external tools, JS clients) work correctly with multiple origins and proper caching.
+
+## Highlights
+
+- Echoes the request origin (`Access-Control-Allow-Origin`) when it matches `dataverse.cors.origin`.
+- Adds `Vary: Origin` for per-origin responses (not for wildcard).
+- Supports a comma-separated origin list; any `*` in the list = wildcard mode.
+- CORS is now only enabled when `dataverse.cors.origin` is set (the now-removed `:AllowCors` setting no longer enables it).
+- All comma-separated configuration settings (database properties and MicroProfile config) now ignore spaces around commas; tokens remain unchanged (no quote parsing). Examples: `dataverse.cors.methods`, `dataverse.cors.headers.allow`, `dataverse.cors.headers.expose`. See "Comma-separated configuration values" in the Installation Guide.
+
+- Docs updated (Installation, Big Data Support, External Tools, File Previews); new tests cover edge cases.
+
+## Admin Action
+
+Set `dataverse.cors.origin` explicitly (required). Use explicit origins (not `*`) for credentialed requests. Ensure proxies keep `Vary: Origin`.
+
+Examples:
+
+```
+dataverse.cors.origin=https://example.org
+dataverse.cors.origin=https://libis.github.io,https://gdcc.github.io
+dataverse.cors.origin=*
+```
+
+Optional (unquoted):
+
+```
+dataverse.cors.methods=GET, POST, OPTIONS, PUT, DELETE
+```
+
+## Compatibility
+
+- You must configure `dataverse.cors.origin`; `:AllowCors` was deprecated and has now been removed.
+- Any `*` triggers wildcard mode (no per-origin echo / no `Vary` header).
+
+## Docs
+
+See the updated `dataverse.cors.origin` section and related notes in Big Data Support (S3), External Tools, and File Previews.
+
+
diff --git a/doc/release-notes/7618-file-level-permissions-restricted-draft.md b/doc/release-notes/7618-file-level-permissions-restricted-draft.md
new file mode 100644
index 00000000000..6b674ee7ec3
--- /dev/null
+++ b/doc/release-notes/7618-file-level-permissions-restricted-draft.md
@@ -0,0 +1,3 @@
+File-level permissions: Restricted files in draft will now show a "Draft/Unpublished" tag in the UI when granting file access.
+
+See #7618
diff --git a/doc/release-notes/metadataLanguage-API-call.md b/doc/release-notes/metadataLanguage-API-call.md
new file mode 100644
index 00000000000..9e2598cf159
--- /dev/null
+++ b/doc/release-notes/metadataLanguage-API-call.md
@@ -0,0 +1,6 @@
+New API endpoints have been implemented for getting and setting the metadata language of a Dataverse collection:
+
+`GET /dataverses/{alias}/allowedMetadataLanguages`: Returns the allowed metadata language(s) of the collection, if any.
+`PUT /dataverses/{alias}/allowedMetadataLanguages/{metadataLanguage}`: Sets a metadata language in the collection.
+
+For more information, see #11856.
\ No newline at end of file diff --git a/doc/sphinx-guides/source/api/changelog.rst b/doc/sphinx-guides/source/api/changelog.rst index d6523bfbdbc..8ba0ce00010 100644 --- a/doc/sphinx-guides/source/api/changelog.rst +++ b/doc/sphinx-guides/source/api/changelog.rst @@ -9,7 +9,10 @@ This API changelog is experimental and we would love feedback on its usefulness. v6.9 ---- + - The POST /api/admin/makeDataCount/{id}/updateCitationsForDataset processing is now asynchronous and the response no longer includes the number of citations. The response can be OK if the request is queued or 503 if the queue is full (default queue size is 1000). +- The way to set per-format size limits for tabular ingest has changed. JSON input is now used. See :ref:`:TabularIngestSizeLimit`. +- In the past, the settings API would accept any key and value. This is no longer the case because validation has been added. See :ref:`settings_put_single`, for example. v6.8 ---- diff --git a/doc/sphinx-guides/source/api/external-tools.rst b/doc/sphinx-guides/source/api/external-tools.rst index 389519318db..57a98a0c7c2 100644 --- a/doc/sphinx-guides/source/api/external-tools.rst +++ b/doc/sphinx-guides/source/api/external-tools.rst @@ -11,6 +11,9 @@ Introduction External tools are additional applications the user can access or open from your Dataverse installation to preview, explore, and manipulate data files and datasets. The term "external" is used to indicate that the tool is not part of the main Dataverse Software. +.. note:: + Browser-based tools must have CORS explicitly enabled via :ref:`dataverse.cors.origin `. List every origin that will host your tool (or use ``*`` when a wildcard is acceptable). If an origin is not listed, the browser will block that tool's API requests even if the tool page itself loads. + Once you have created the external tool itself (which is most of the work!), you need to teach a Dataverse installation how to construct URLs that your tool needs to operate. 
For example, if you've deployed your tool to fabulousfiletool.com your tool might want the ID of a file and the siteUrl of the Dataverse installation like this: https://fabulousfiletool.com?fileId=42&siteUrl=https://demo.dataverse.org In short, you will be creating a manifest in JSON format that describes not only how to construct URLs for your tool, but also what types of files your tool operates on, where it should appear in the Dataverse installation web interfaces, etc. diff --git a/doc/sphinx-guides/source/api/native-api.rst b/doc/sphinx-guides/source/api/native-api.rst index c3870372614..db9a408083a 100644 --- a/doc/sphinx-guides/source/api/native-api.rst +++ b/doc/sphinx-guides/source/api/native-api.rst @@ -287,6 +287,51 @@ The fully expanded example above (without environment variables) looks like this curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" "https://demo.dataverse.org/api/dataverses/root/roles" +List the Allowed Metadata Languages of a Dataverse Collection +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Shows the allowed metadata languages of the Dataverse collection ``id``: + +.. code-block:: bash + + export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx + export SERVER_URL=https://demo.dataverse.org + export ID=root + + curl -H "X-Dataverse-key:$API_TOKEN" "$SERVER_URL/api/dataverses/$ID/allowedMetadataLanguages" + +The fully expanded example above (without environment variables) looks like this: + +.. code-block:: bash + + curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" "https://demo.dataverse.org/api/dataverses/root/allowedMetadataLanguages" + +If there are no metadata languages configured on the server, this call returns an empty array. If the Dataverse collection has a mandatory metadata language, the return value is an array of that single language, +otherwise it's an array of all available metadata languages on the server. 
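Since the response above is a plain array of language codes, a client script can pre-check a candidate code before calling the ``PUT`` endpoint in the next section. A minimal sketch follows; the parsing is deliberately naive, assumes a flat array such as ``["en","fr"]`` extracted from the response, and is not part of Dataverse (for real scripts, prefer ``jq``):

```shell
# Naive membership check on a JSON array of language codes (illustrative
# only). Assumes a flat array like ["en","fr"]; no nested JSON handling.
is_allowed_language() {
  local response="$1" code="$2"
  local cleaned
  cleaned=$(printf '%s' "$response" | tr -d '][" ')   # e.g. -> en,fr
  case ",$cleaned," in
    *",$code,"*) return 0 ;;
    *)           return 1 ;;
  esac
}

if is_allowed_language '["en", "fr"]' "fr"; then
  echo "fr is allowed"
fi
```

Pre-checking like this lets a script give a friendlier error than the 400 the server would otherwise return for an unavailable language.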
+
+Set the Allowed Metadata Language of a Dataverse Collection
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Sets the allowed metadata language of the Dataverse collection ``id`` to ``langCode`` if it's available on the server:
+
+.. code-block:: bash
+
+  export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
+  export SERVER_URL=https://demo.dataverse.org
+  export ID=root
+  export LANGCODE=en
+
+  curl -H "X-Dataverse-key:$API_TOKEN" -X PUT "$SERVER_URL/api/dataverses/$ID/allowedMetadataLanguages/$LANGCODE"
+
+The fully expanded example above (without environment variables) looks like this:
+
+.. code-block:: bash
+
+  curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X PUT "https://demo.dataverse.org/api/dataverses/root/allowedMetadataLanguages/en"
+
+Returns an array containing the single metadata language that was set.
+If the metadata language is not available on the server, this call responds with a 400 BAD REQUEST.
+
 List Facets Configured for a Dataverse Collection
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -5843,13 +5888,13 @@ Builtin users are known as "Username/Email and Password" users in the :doc:`/use
 Create a Builtin User
 ~~~~~~~~~~~~~~~~~~~~~
 
-For security reasons, builtin users cannot be created via API unless the team who runs the Dataverse installation has populated a database setting called ``BuiltinUsers.KEY``, which is described under :ref:`securing-your-installation` and :ref:`database-settings` sections of Configuration in the Installation Guide. You will need to know the value of ``BuiltinUsers.KEY`` before you can proceed.
+For security reasons, builtin users cannot be created via API unless the team who runs the Dataverse installation has populated a database setting called ``:BuiltinUsersKey``, which is described under the :ref:`securing-your-installation` and :ref:`database-settings` sections of Configuration in the Installation Guide. You will need to know the value of ``:BuiltinUsersKey`` before you can proceed.
To create a builtin user via API, you must first construct a JSON document. You can download :download:`user-add.json <../_static/api/user-add.json>` or copy the text below as a starting point and edit as necessary. .. literalinclude:: ../_static/api/user-add.json -Place this ``user-add.json`` file in your current directory and run the following curl command, substituting variables as necessary. Note that both the password of the new user and the value of ``BuiltinUsers.KEY`` are passed as query parameters:: +Place this ``user-add.json`` file in your current directory and run the following curl command, substituting variables as necessary. Note that both the password of the new user and the value of ``:BuiltinUsersKey`` are passed as query parameters:: curl -d @user-add.json -H "Content-type:application/json" "$SERVER_URL/api/builtin-users?password=$NEWUSER_PASSWORD&key=$BUILTIN_USERS_KEY" @@ -7133,35 +7178,193 @@ If the PID is not managed by Dataverse, this call will report if the PID is reco Admin ----- -This is the administrative part of the API. For security reasons, it is absolutely essential that you block it before allowing public access to a Dataverse installation. Blocking can be done using settings. See the ``post-install-api-block.sh`` script in the ``scripts/api`` folder for details. See :ref:`blocking-api-endpoints` in Securing Your Installation section of the Configuration page of the Installation Guide. +This is the administrative part of the API. +For security reasons, it is absolutely essential that you block it before allowing public access to a Dataverse installation. +See :ref:`blocking-api-endpoints` in the Installation Guide for details. + +.. note:: See :ref:`curl-examples-and-environment-variables` if you are unfamiliar with the use of export below. + +.. _admin-api-db-settings: + +Manage Database Settings +~~~~~~~~~~~~~~~~~~~~~~~~ + +These are the API endpoints for managing the :ref:`database-settings` listed in the Installation Guide. 
+ +.. _settings_get_all: List All Database Settings -~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^ -List all settings:: +.. code-block:: bash - GET http://$SERVER/api/admin/settings + export SERVER_URL="http://localhost:8080" + + curl "$SERVER_URL/api/admin/settings" -Configure Database Setting -~~~~~~~~~~~~~~~~~~~~~~~~~~ +The fully expanded example above (without environment variables) looks like this: + +.. code-block:: bash -Sets setting ``name`` to the body of the request:: + curl http://localhost:8080/api/admin/settings - PUT http://$SERVER/api/admin/settings/$name +.. _settings_get_single: Get Single Database Setting -~~~~~~~~~~~~~~~~~~~~~~~~~~~ +^^^^^^^^^^^^^^^^^^^^^^^^^^^ -Get the setting under ``name``:: +.. code-block:: bash - GET http://$SERVER/api/admin/settings/$name + export SERVER_URL="http://localhost:8080" + export NAME=":UploadMethods" + + curl "$SERVER_URL/api/admin/settings/$NAME" -Delete Database Setting -~~~~~~~~~~~~~~~~~~~~~~~ +The fully expanded example above (without environment variables) looks like this: + +.. code-block:: bash + + curl http://localhost:8080/api/admin/settings/:UploadMethods + +.. _settings_get_single_lang: + +Get Single Database Setting With Language/Locale +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +A small number of settings, most notably :ref:`:ApplicationTermsOfUse`, can be saved in multiple languages. + +Use two-character ISO 639-1 language codes. + +.. code-block:: bash + + export SERVER_URL="http://localhost:8080" + export NAME=":ApplicationTermsOfUse" + export LANG="en" + + curl "$SERVER_URL/api/admin/settings/$NAME/lang/$LANG" + +The fully expanded example above (without environment variables) looks like this: + +.. code-block:: bash + + curl http://localhost:8080/api/admin/settings/:ApplicationTermsOfUse/lang/en -Delete the setting under ``name``:: +.. _settings_put_single: + +Configure Single Database Setting +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. 
code-block:: bash + + export SERVER_URL="http://localhost:8080" + export NAME=":InstallationName" + export VALUE="LibreScholar" + + curl -X PUT "$SERVER_URL/api/admin/settings/$NAME" -d "$VALUE" + +The fully expanded example above (without environment variables) looks like this: + +.. code-block:: bash + + curl -X PUT http://localhost:8080/api/admin/settings/:InstallationName -d LibreScholar + +Note: ``NAME`` values are validated for existence and compliance. + +.. _settings_put_single_lang: + +Configure Single Database Setting With Language/Locale +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +A small number of settings, most notably :ref:`:ApplicationTermsOfUse`, can be saved in multiple languages. + +Use two-character ISO 639-1 language codes. + +.. code-block:: bash + + export SERVER_URL="http://localhost:8080" + export NAME=":ApplicationTermsOfUse" + export LANG="fr" + + curl -X PUT "$SERVER_URL/api/admin/settings/$NAME/lang/$LANG" --upload-file /tmp/apptou_fr.html + +The fully expanded example above (without environment variables) looks like this: + +.. code-block:: bash + + curl -X PUT http://localhost:8080/api/admin/settings/:ApplicationTermsOfUse/lang/fr --upload-file /tmp/apptou_fr.html + +Note: ``NAME`` and ``LANG`` values are validated for existence and compliance. + +.. _settings_put_bulk: + +Configure All Database Settings +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Using a JSON file, replace all settings in a single idempotent and atomic operation and delete any settings not present in that JSON file. + +Use the JSON ``data`` object in output of ``GET /api/admin/settings`` (:ref:`settings_get_all`) for the JSON input structure for this endpoint. +To put this concretely, you can save just the ``data`` object for your existing settings to disk by filtering them through ``jq`` like this: + +.. 
code-block:: bash + + curl http://localhost:8080/api/admin/settings | jq '.data' > /tmp/all-settings.json + +Then you can use this "all-settings.json" file as a starting point for your input file. +The :doc:`../installation/config` page of the Installation Guide has a :ref:`complete list of all the available settings `. +Note that settings in the JSON file are validated for existence and compliance. + +.. code-block:: bash + + export SERVER_URL="http://localhost:8080" + + curl -X PUT -H "Content-type:application/json" "$SERVER_URL/api/admin/settings" --upload-file /tmp/all-settings.json + +The fully expanded example above (without environment variables) looks like this: + +.. code-block:: bash + + curl -X PUT -H "Content-type:application/json" http://localhost:8080/api/admin/settings --upload-file /tmp/all-settings.json + +.. _settings_delete_single: + +Delete Single Database Setting +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: bash + + export SERVER_URL="http://localhost:8080" + export NAME=":InstallationName" + + curl -X DELETE "$SERVER_URL/api/admin/settings/$NAME" + +The fully expanded example above (without environment variables) looks like this: + +.. code-block:: bash + + curl -X DELETE http://localhost:8080/api/admin/settings/:InstallationName + +.. _settings_delete_single_lang: + +Delete Single Database Setting With Language/Locale +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +A small number of settings, most notably :ref:`:ApplicationTermsOfUse`, can be saved in multiple languages. + +Use two-character ISO 639-1 language codes. + +.. code-block:: bash + + export SERVER_URL="http://localhost:8080" + export NAME=":ApplicationTermsOfUse" + export LANG="fr" + + curl -X DELETE "$SERVER_URL/api/admin/settings/$NAME/lang/$LANG" + +The fully expanded example above (without environment variables) looks like this: + +.. 
code-block:: bash - DELETE http://$SERVER/api/admin/settings/$name + curl -X DELETE http://localhost:8080/api/admin/settings/:ApplicationTermsOfUse/lang/fr .. _list-all-feature-flags: diff --git a/doc/sphinx-guides/source/developers/big-data-support.rst b/doc/sphinx-guides/source/developers/big-data-support.rst index 75a50e2513d..ef13143be02 100644 --- a/doc/sphinx-guides/source/developers/big-data-support.rst +++ b/doc/sphinx-guides/source/developers/big-data-support.rst @@ -57,6 +57,15 @@ Allow CORS for S3 Buckets **IMPORTANT:** One additional step that is required to enable direct uploads via a Dataverse installation and for direct download to work with previewers and direct upload to work with dvwebloader (:ref:`folder-upload`) is to allow cross site (CORS) requests on your S3 store. The example below shows how to enable CORS rules (to support upload and download) on a bucket using the AWS CLI command line tool. Note that you may want to limit the AllowedOrigins and/or AllowedHeaders further. https://github.com/gdcc/dataverse-previewers/wiki/Using-Previewers-with-download-redirects-from-S3 has some additional information about doing this. +Dataverse itself will only emit the necessary ``Access-Control-*`` headers to browsers when CORS has been explicitly enabled via the JVM/MicroProfile setting :ref:`dataverse.cors.origin `. You must both: + +* Configure an appropriate ``dataverse.cors.origin`` value (single origin, comma-separated list, or ``*``) on the Dataverse application server; and +* Configure a matching/compatible CORS policy on each S3 bucket (and any CDN/proxy in front of it) that will be used for direct upload or for redirect (download-redirect) operations consumed by previewers. + +If you specify multiple origins in ``dataverse.cors.origin`` Dataverse will echo back the requesting origin (when it matches) and will include ``Vary: Origin`` so that shared caches do not serve one origin's response to another. 
If you configure ``*`` Dataverse will respond with ``Access-Control-Allow-Origin: *`` (note that browsers will not allow credentialed requests with a wildcard). + +Make sure the bucket CORS configuration ``AllowedOrigins`` is at least as permissive as the origins you configure in ``dataverse.cors.origin``. If the bucket allows ``*`` but the Dataverse application only allows a subset, the browser will still enforce the more restrictive application response. + If you'd like to check the CORS configuration on your bucket before making changes: ``aws s3api get-bucket-cors --bucket `` diff --git a/doc/sphinx-guides/source/developers/testing.rst b/doc/sphinx-guides/source/developers/testing.rst index f84a7cf1ac7..733a0b0ba28 100755 --- a/doc/sphinx-guides/source/developers/testing.rst +++ b/doc/sphinx-guides/source/developers/testing.rst @@ -209,7 +209,7 @@ The Burrito Key For reasons that have been lost to the mists of time, the Dataverse software really wants you to to have a burrito. Specifically, if you're trying to run REST Assured tests and see the error "Dataverse config issue: No API key defined for built in user management", you must run the following curl command (or make an equivalent change to your database): -``curl -X PUT -d 'burrito' http://localhost:8080/api/admin/settings/BuiltinUsers.KEY`` +``curl -X PUT -d 'burrito' http://localhost:8080/api/admin/settings/:BuiltinUsersKey`` Without this "burrito" key in place, REST Assured will not be able to create users. We create users to create objects we want to test, such as collections, datasets, and files. diff --git a/doc/sphinx-guides/source/installation/config.rst b/doc/sphinx-guides/source/installation/config.rst index 0866da892ce..6c19464489d 100644 --- a/doc/sphinx-guides/source/installation/config.rst +++ b/doc/sphinx-guides/source/installation/config.rst @@ -10,6 +10,27 @@ Once you have finished securing and configuring your Dataverse installation, you .. contents:: |toctitle| :local: +.. 
_comma-separated-config-values: + +Comma-separated configuration values +------------------------------------ + +Many configuration options (both MicroProfile/JVM settings and database settings) accept comma-separated lists. For all such settings, Dataverse applies consistent, lightweight parsing: + +- Whitespace immediately around commas is ignored (e.g., ``GET, POST`` is equivalent to ``GET,POST``). +- Tokens are otherwise preserved exactly as typed. There is no quote parsing and no escape processing. +- Embedded commas within a token are not supported. + +Examples include (but are not limited to): + +- :ref:`dataverse.cors.origin ` +- :ref:`dataverse.cors.methods ` +- :ref:`dataverse.cors.headers.allow ` +- :ref:`dataverse.cors.headers.expose ` +- :ref:`:UploadMethods` + +This behavior is implemented centrally and applies across all Dataverse settings that accept comma-separated values. + .. _securing-your-installation: Securing Your Installation @@ -25,7 +46,7 @@ The default password for the "dataverseAdmin" superuser account is "admin", as m Blocking API Endpoints ++++++++++++++++++++++ -The :doc:`/api/native-api` contains a useful but potentially dangerous set of API endpoints called "admin" that allows you to change system settings, make ordinary users into superusers, and more. The "builtin-users" endpoints let admins do tasks such as creating a local/builtin user account if they know the key defined in :ref:`BuiltinUsers.KEY`. +The :doc:`/api/native-api` contains a useful but potentially dangerous set of API endpoints called "admin" that allows you to change system settings, make ordinary users into superusers, and more. The "builtin-users" endpoints let admins do tasks such as creating a local/builtin user account if they know the key defined in :ref:`:BuiltinUsersKey`. By default in the code, most of these API endpoints can be operated on remotely and a number of endpoints do not require authentication. 
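The parsing rule for comma-separated values described earlier on this page can be sketched as follows. This is an illustrative one-liner, not the actual Dataverse implementation; note that only whitespace around commas is stripped:

```shell
# Illustrative emulation of the documented parsing (not Dataverse source):
# whitespace around commas is dropped, tokens are otherwise untouched,
# and there is no quote or escape handling.
split_config_list() {
  printf '%s' "$1" | sed 's/[[:space:]]*,[[:space:]]*/,/g' | tr ',' '\n'
}

split_config_list "GET, POST ,OPTIONS"
```

Each token is printed on its own line; `GET, POST ,OPTIONS` therefore yields the same three tokens as `GET,POST,OPTIONS`.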
However, the endpoints "admin" and "builtin-users" are limited to localhost out of the box by the installer, using the JvmSettings :ref:`dataverse.api.blocked.endpoints` and :ref:`dataverse.api.blocked.policy`. @@ -807,7 +828,7 @@ Both Local and Remote Auth The ``authenticationproviderrow`` database table controls which "authentication providers" are available within a Dataverse installation. Out of the box, a single row with an id of "builtin" will be present. For each user in a Dataverse installation, the ``authenticateduserlookup`` table will have a value under ``authenticationproviderid`` that matches this id. For example, the default "dataverseAdmin" user will have the value "builtin" under ``authenticationproviderid``. Why is this important? Users are tied to a specific authentication provider but conversion mechanisms are available to switch a user from one authentication provider to the other. As explained in the :doc:`/user/account` section of the User Guide, a graphical workflow is provided for end users to convert from the "builtin" authentication provider to a remote provider. Conversion from a remote authentication provider to the builtin provider can be performed by a sysadmin with access to the "admin" API. See the :doc:`/api/native-api` section of the API Guide for how to list users and authentication providers as JSON. -Adding and enabling a second authentication provider (:ref:`native-api-add-auth-provider` and :ref:`api-toggle-auth-provider`) will result in the Log In page showing additional providers for your users to choose from. By default, the Log In page will show the "builtin" provider, but you can adjust this via the :ref:`conf-default-auth-provider` configuration option. Further customization can be achieved by setting :ref:`conf-allow-signup` to "false", thus preventing users from creating local accounts via the web interface. 
Please note that local accounts can also be created through the API by enabling the ``builtin-users`` endpoint (:ref:`:BlockedApiEndpoints`) and setting the ``BuiltinUsers.KEY`` database setting (:ref:`BuiltinUsers.KEY`). +Adding and enabling a second authentication provider (:ref:`native-api-add-auth-provider` and :ref:`api-toggle-auth-provider`) will result in the Log In page showing additional providers for your users to choose from. By default, the Log In page will show the "builtin" provider, but you can adjust this via the :ref:`conf-default-auth-provider` configuration option. Further customization can be achieved by setting :ref:`conf-allow-signup` to "false", thus preventing users from creating local accounts via the web interface. Please note that local accounts can also be created through the API by enabling the ``builtin-users`` endpoint (:ref:`:BlockedApiEndpoints`) and setting the ``:BuiltinUsersKey`` database setting (:ref:`:BuiltinUsersKey`). To configure Shibboleth see the :doc:`shibboleth` section and to configure OAuth see the :doc:`oauth2` section. @@ -3704,10 +3725,9 @@ The following settings control Cross-Origin Resource Sharing (CORS) for your Dat dataverse.cors.origin +++++++++++++++++++++ -Allowed origins for CORS requests. The default with no value set is to not include CORS headers. However, if the deprecated :AllowCors setting is explicitly set to true the default is "\*" (all origins). -When the :AllowsCors setting is not used, you must set this setting to "\*" or a list of origins to enable CORS headers. +Allowed origins for CORS requests. If this setting is not defined, CORS headers are not added. Set to ``*`` to allow all origins (note that browsers will not allow credentialed requests with ``*``) or provide a comma-separated list of explicit origins. -Multiple origins can be specified as a comma-separated list. 
+Multiple origins can be specified as a comma-separated list (whitespace around commas is ignored).
 
 Example:
 
@@ -3715,6 +3735,11 @@ Example:
 
 Can also be set via any `supported MicroProfile Config API source`_, e.g. the environment variable ``DATAVERSE_CORS_ORIGIN``.
 
+Behavior:
+
+* When a list of origins is configured, Dataverse echoes the single matching request ``Origin`` value in ``Access-Control-Allow-Origin`` and adds ``Vary: Origin`` to support correct proxy/CDN caching.
+* When ``*`` is configured, ``Access-Control-Allow-Origin: *`` is sent and ``Vary`` is not modified.
+
 .. _dataverse.cors.methods:
 
 dataverse.cors.methods
@@ -3921,11 +3946,14 @@ You might also create your own profiles and use these, please refer to the upstr
 Database Settings
 -----------------
 
-These settings are stored in the ``setting`` database table but can be read and modified via the "admin" endpoint of the :doc:`/api/native-api` for easy scripting.
+These settings are stored in the ``setting`` database table, but we recommend using the Settings Admin API (:ref:`admin-api-db-settings`) to view and modify them, as shown below.
+If changed in the database directly, you need to reload the application to make the ORM pick up the changes.
 
-The most commonly used configuration options are listed first.
+In short:
 
-The pattern you will observe in curl examples below is that an HTTP ``PUT`` is used to add or modify a setting. If you perform an HTTP ``GET`` (the default when using curl), the output will contain the value of the setting, if it has been set. You can also do a ``GET`` of all settings with ``curl http://localhost:8080/api/admin/settings`` which you may want to pretty-print by piping the output through a tool such as jq by appending ``| jq .``. If you want to remove a setting, use an HTTP ``DELETE`` such as ``curl -X DELETE http://localhost:8080/api/admin/settings/:GuidesBaseUrl`` .
+- HTTP ``GET`` is used to show settings.
+- HTTP ``PUT`` is used to add or modify settings.
+- HTTP ``DELETE`` is used to delete settings.
 
 .. _:BlockedApiPolicy:
 
@@ -3981,14 +4009,16 @@ Now that ``:BlockedApiKey`` has been enabled, blocked APIs can be accessed using
 
 ``curl https://demo.dataverse.org/api/admin/settings?unblock-key=theKeyYouChose``
 
-.. _BuiltinUsers.KEY:
+.. _:BuiltinUsersKey:
 
-BuiltinUsers.KEY
+:BuiltinUsersKey
 ++++++++++++++++
 
-The key required to create users via API as documented at :doc:`/api/native-api`. Unlike other database settings, this one doesn't start with a colon.
+The key required to create users via API as documented at :doc:`/api/native-api`.
 
-``curl -X PUT -d builtInS3kretKey http://localhost:8080/api/admin/settings/BuiltinUsers.KEY``
+``curl -X PUT -d builtInS3kretKey http://localhost:8080/api/admin/settings/:BuiltinUsersKey``
+
+Note: this key used to be named ``BuiltinUsers.KEY`` until Dataverse 6.8.
 
 :SearchApiRequiresToken
 +++++++++++++++++++++++
@@ -4418,33 +4448,65 @@ For performance reasons, your Dataverse installation will only allow creation of
 
 In the UI, users trying to download a zip file larger than the Dataverse installation's :ZipDownloadLimit will receive messaging that the zip file is too large, and the user will be presented with alternate access options.
 
+.. _:TabularIngestSizeLimit:
+
 :TabularIngestSizeLimit
 +++++++++++++++++++++++
 
-Threshold in bytes for limiting whether or not "ingest" it attempted for tabular files (which can be resource intensive). For example, with the below in place, files greater than 2 GB in size will not go through the ingest process:
+Threshold in bytes for limiting whether or not "ingest" is attempted for an uploaded tabular file (which can be resource intensive).
+For more on the ingest feature, see :doc:`/user/tabulardataingest/index` in the User Guide.
+
+There are two ways to specify ingest size limits. You can set a global limit for all file types or you can use a JSON file for more granularity. We'll cover the global limit first.
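The value semantics used below (``-1`` or absent = no limit, ``0`` = never ingest, otherwise a byte threshold above which ingest is skipped) can be sketched as follows. This is an illustrative shell model, not Dataverse source; the exact "size equal to the limit" boundary is inferred from "files greater than the limit are not ingested":

```shell
# Illustrative decision logic for a single limit value (NOT Dataverse code).
# -1 or unset means no limit; 0 disables ingest; N is a byte threshold.
should_ingest() {
  local limit="${1:--1}" size="$2"
  if [ "$limit" -lt 0 ]; then
    return 0                  # no limit configured: always attempt ingest
  fi
  [ "$size" -le "$limit" ]    # a limit of 0 therefore never ingests
}

should_ingest 2000000000 3000000000 || echo "3 GB file with a 2 GB limit: skipped"
```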
+ +With the following value in place (again, expressed in bytes), files greater than 2 GB in size will not go through the ingest process: ``curl -X PUT -d 2000000000 http://localhost:8080/api/admin/settings/:TabularIngestSizeLimit`` -(You can set this value to 0 to prevent files from being ingested at all.) +You can set this value to ``0`` to prevent files from being ingested at all. -You can override this global setting on a per-format basis for the following formats: +Out of the box, the ``:TabularIngestSizeLimit`` setting is absent, which results in ingest being attempted no matter how large the file is. You can specify this "no size limit" default explicitly with the value ``-1``. +Using a JSON-based setting, you can set a global default and per-format limits for the following formats: + +- CSV - DTA - POR -- SAV - Rdata -- CSV -- XLSX (in lower-case) +- SAV +- XLSX -For example : +(In previous releases of Dataverse, a colon-separated form was used to specify per-format limits, such as ``:TabularIngestSizeLimit:Rdata``, but this is no longer supported. Now JSON is used.) -* if you want your Dataverse installation to not attempt to ingest Rdata files larger than 1 MB, use this setting: +The expected JSON is an object with key/value pairs like the following. Format names are case-insensitive, and all fields are optional. The size limits must be strings with double quotes around them (e.g. ``"10"``) rather than numbers (e.g. ``10``). -``curl -X PUT -d 1000000 http://localhost:8080/api/admin/settings/:TabularIngestSizeLimit:Rdata`` +.. code:: json -* if you want your Dataverse installation to not attempt to ingest XLSX files at all, use this setting: + { + "default": "-1", + "csv": "0", + "dta": "10", + "por": "100" + } + +Whatever JSON you send will overwrite existing values. 
If you have any existing ``:TabularIngestSizeLimit`` settings, you can use the following command to see them in the expected input format above (and then add the new settings you want): + +``curl http://localhost:8080/api/admin/settings/:TabularIngestSizeLimit | jq -r '.data.message'`` + +The ``default`` key is optional and can be used to give limits to formats that are not specified in the JSON. If you omit the ``default`` key or set it to ``"-1"``, no limits are applied to formats not specified in the JSON. If you set it to ``"0"``, ingest will be disabled (but you can override this per-format). + +Add a format name (``csv``, ``dta``, etc., as listed above) to change the limit for that particular format. + +Examples: + +1. If you want your Dataverse installation to not attempt to ingest Rdata files larger than 1 MB but otherwise be unlimited: + + ``curl -X PUT -d '{"Rdata":"1000000"}' http://localhost:8080/api/admin/settings/:TabularIngestSizeLimit`` +2. If you want your Dataverse installation to not attempt to ingest XLSX files at all and apply a global limit of 512 MiB, use this setting: -``curl -X PUT -d 0 http://localhost:8080/api/admin/settings/:TabularIngestSizeLimit:xlsx`` + ``curl -X PUT -d '{"default":"536870912", "XLSX":"0"}' http://localhost:8080/api/admin/settings/:TabularIngestSizeLimit`` +3. If you want your Dataverse installation to not attempt to ingest files at all except for CSV files that are 256 MiB or smaller, use this setting: + + ``curl -X PUT -d '{"default":"0", "CSV":"268435456"}' http://localhost:8080/api/admin/settings/:TabularIngestSizeLimit`` :ZipUploadFilesLimit ++++++++++++++++++++ @@ -4991,20 +5053,6 @@ This can be helpful in situations where multiple organizations are sharing one D or ``curl -X PUT -d '*' http://localhost:8080/api/admin/settings/:InheritParentRoleAssignments``
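The fallback rules above (a per-format limit wins, otherwise ``default`` applies; ``-1`` or an absent key means unlimited, ``0`` disables ingest) can be sketched in shell. This is an illustrative model only — the actual lookup happens server-side in Java, and the limits used here (a 256 MiB CSV limit with ingest otherwise disabled) are made up for the example:

```shell
# Illustrative model of the :TabularIngestSizeLimit lookup rules:
# per-format limit wins, otherwise "default" applies;
# -1 (or absent) = unlimited, 0 = ingest disabled.
limit_for() { # $1 = lower-cased format name
  case "$1" in
    csv) echo "268435456" ;;  # hypothetical per-format limit (256 MiB)
    *)   echo "0"         ;;  # hypothetical "default": ingest disabled
  esac
}
should_ingest() { # $1 = format, $2 = file size in bytes
  local limit
  limit=$(limit_for "$1")
  if [ "$limit" = "-1" ]; then echo "yes"
  elif [ "$limit" = "0" ]; then echo "no"
  elif [ "$2" -le "$limit" ]; then echo "yes"
  else echo "no"
  fi
}
should_ingest csv 1000000   # small CSV: within the per-format limit
should_ingest dta 1000000   # DTA: falls back to the disabled default
```

This mirrors example 3 above: everything falls back to a disabled default except CSV files under the per-format threshold.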
- This legacy setting will only be used if the newer JVM settings are not set. - -Enable or disable support for Cross-Origin Resource Sharing (CORS) by setting ``:AllowCors`` to ``true`` or ``false``. - -``curl -X PUT -d true http://localhost:8080/api/admin/settings/:AllowCors`` - -.. note:: - New values for this setting will only be used after a server restart. - :ChronologicalDateFacets ++++++++++++++++++++++++ diff --git a/doc/sphinx-guides/source/user/dataset-management.rst b/doc/sphinx-guides/source/user/dataset-management.rst index 0802d1255b6..ff8cbe79a46 100755 --- a/doc/sphinx-guides/source/user/dataset-management.rst +++ b/doc/sphinx-guides/source/user/dataset-management.rst @@ -175,6 +175,9 @@ File Previews Dataverse installations can add previewers for common file types uploaded by their research communities. The previews appear on the file page. If a preview tool for a specific file type is available, the preview will be created and will display automatically, after terms have been agreed to or a guestbook entry has been made, if necessary. File previews are not available for restricted files unless they are being accessed using a Preview URL. See also :ref:`previewUrl`. When the dataset license is not the default license, users will be prompted to accept the license/data use agreement before the preview is shown. See also :ref:`license-terms`. +.. note:: + Some previewers run purely in the browser and make direct (JavaScript) requests back to the Dataverse API endpoints to retrieve file contents, metadata, or signed URLs. For these previewers to function when hosted on a different origin (e.g., a CDN or a separate previewer service), the Dataverse installation must have CORS enabled via :ref:`dataverse.cors.origin`. Administrators should configure the list of allowed origins to include the host serving the previewers.
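The origin-matching behavior documented in the CORS sections above can be modeled as follows. This is a sketch of the documented rules, not the server implementation; the configured list and request origin are example values:

```shell
# Sketch of the documented CORS rule: wildcard sends "*" (no Vary header),
# otherwise the single matching request Origin is echoed back together
# with "Vary: Origin". Whitespace around commas is ignored.
cors_headers() { # $1 = configured dataverse.cors.origin, $2 = request Origin
  if [ "$1" = "*" ]; then
    echo "Access-Control-Allow-Origin: *"
    return
  fi
  echo "$1" | tr ',' '\n' | sed 's/^[[:space:]]*//;s/[[:space:]]*$//' \
    | while read -r allowed; do
        if [ "$allowed" = "$2" ]; then
          echo "Access-Control-Allow-Origin: $2"
          echo "Vary: Origin"
        fi
      done
}
cors_headers "https://a.example , https://b.example" "https://b.example"
```

A non-matching origin produces no CORS headers at all, which is why `Vary: Origin` matters for caches sitting in front of the instance.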
+ Previewers are available for the following file types: - Text diff --git a/modules/container-configbaker/Dockerfile b/modules/container-configbaker/Dockerfile index 5532cda1a9e..9fc876a283b 100644 --- a/modules/container-configbaker/Dockerfile +++ b/modules/container-configbaker/Dockerfile @@ -23,6 +23,8 @@ ENV PATH="${PATH}:${SCRIPT_DIR}" \ ARG PKGS="bc curl dnsutils dumb-init ed jq netcat-openbsd postgresql-client" # renovate: datasource=github-releases depName=wait4x/wait4x ARG WAIT4X_VERSION="v3.2.0" +# renovate: datasource=github-releases depName=mikefarah/yq +ARG YQ_VERSION="v4.47.1" # renovate: datasource=pypi depName=awscli ARG AWSCLI_VERSION="1.40.15" ARG PYTHON_PKGS="awscli==${AWSCLI_VERSION}" @@ -65,7 +67,11 @@ RUN true && \ echo "$(cat /tmp/w4x-checksum | cut -f1 -d" ") /usr/bin/wait4x.tar.gz" | sha256sum -c - && \ tar -xzf /usr/bin/wait4x.tar.gz -C /usr/bin && chmod +x /usr/bin/wait4x && \ - # 2. Python packages + # 2. yq-go + curl -sSfL -o /usr/bin/yq "https://github.com/mikefarah/yq/releases/download/${YQ_VERSION}/yq_linux_${ARCH}" && \ + chmod +x /usr/bin/yq && \ + + # 3. Python packages pipx install --global ${PYTHON_PKGS} # Get in the scripts @@ -81,7 +87,7 @@ COPY --from=solr /opt/solr/server/solr/configsets/_default ${SOLR_TEMPLATE}/ COPY maven/solr/*.xml ${SOLR_TEMPLATE}/conf/ RUN rm ${SOLR_TEMPLATE}/conf/managed-schema.xml - +WORKDIR ${SCRIPT_DIR} # Set the entrypoint to tini (as a process supervisor) ENTRYPOINT ["/usr/bin/dumb-init", "--"] # By default run a script that will print a help message and terminate diff --git a/modules/container-configbaker/scripts/apply-db-settings.sh b/modules/container-configbaker/scripts/apply-db-settings.sh new file mode 100755 index 00000000000..deb897d138c --- /dev/null +++ b/modules/container-configbaker/scripts/apply-db-settings.sh @@ -0,0 +1,45 @@ +#!/usr/bin/env bash + +# [INFO]: Idempotent replacement of all database settings from a file source.
+ +set -euo pipefail + +function usage() { + echo "Usage: $(basename "$0") [-h] [-u instanceUrl] [-t timeout] [-c configFile] [-b unblockKey] [-e envSource]" + echo "" + echo "Replace all Database Settings in a running Dataverse installation in an idempotent way." + echo "" + echo "Parameters:" + echo "instanceUrl - Location on container network where to reach your instance. Default: 'http://dataverse:8080'" + echo " Can be set as environment variable 'DATAVERSE_URL'." + echo " timeout - Provide how long to wait for the instance to become available (using wait4x). Default: '3m'" + echo " Can be set as environment variable 'TIMEOUT'." + echo " configFile - Path to a JSON, YAML, PROPERTIES or TOML file containing your settings. Default: '/dv/db-opts.yml'" + echo " Can be set as environment variable 'CONFIG_FILE'. May contain \${var} references to env. vars." + echo " unblockKey - Either string or path to a file with the Admin API Unblock Key. Optional for localhost. No default." + echo " Can be set as environment variable 'ADMIN_API_UNBLOCK_KEY'." + echo " envSource - Path to a file or directory used as source for additional environment variables." + echo " Optional, no default. Can be set as environment variable 'ENV_SOURCE'." + echo " Environment variables from this file or directory structure will be script-local." + echo "" + echo "Note: This script will wait for the Dataverse instance to be available before executing the replacement." + echo " Be careful - this script will not stop you from deleting any vital settings." + echo "" + exit 1 +} + +source util/common.sh +source util/read-to-env.sh + +# Check for (the right) yq, jq, and wait4x being available +require_on_path yq +if ! 
grep -q "https://github.com/mikefarah/yq" <((yq --version)); then + error "You must install yq from https://github.com/mikefarah/yq, not https://github.com/kislyuk/yq" +fi +require_on_path jq +require_on_path wait4x + +# Set some defaults as documented +DATAVERSE_URL=${DATAVERSE_URL:-"http://dataverse:8080"} +ADMIN_API_UNBLOCK_KEY=${ADMIN_API_UNBLOCK_KEY:-""} +TIMEOUT=${TIMEOUT:-"3m"} +CONFIG_FILE=${CONFIG_FILE:-"/dv/db-opts.yml"} +ENV_SOURCE=${ENV_SOURCE:-""} + +while getopts "u:t:c:b:e:h" OPTION +do + case "$OPTION" in + u) DATAVERSE_URL="$OPTARG" ;; + t) TIMEOUT="$OPTARG" ;; + c) CONFIG_FILE="$OPTARG" ;; + b) ADMIN_API_UNBLOCK_KEY="$OPTARG" ;; + e) ENV_SOURCE="$OPTARG" ;; + h) usage;; + \?) usage;; + esac +done +shift $((OPTIND-1)) + +##### ##### ##### ##### ##### ##### ##### ##### ##### ##### ##### ##### ##### ##### ##### +# PARSE CONFIGURATION + +# In case the env source was given as cmd arg, parse it +if [ -n "$ENV_SOURCE" ]; then + read_to_env "$ENV_SOURCE" +fi + +# Check for file with DB options given, file present and readable as well as parseable by yq +# If parseable, render as JSON to temp file +CONV_CONF_FILE=$(mktemp) +if [ -f "${CONFIG_FILE}" ] && [ -r "${CONFIG_FILE}" ]; then + # See https://mikefarah.gitbook.io/yq/operators/env-variable-operators#tip + yq -M -o json '(.. | select(tag == "!!str")) |= envsubst(nu)' "${CONFIG_FILE}" > "${CONV_CONF_FILE}" || error "Could not parse config file with yq from '${CONFIG_FILE}'." + # TODO: think about adding a debug switch here, not just print + # cat "$CONV_CONF_FILE" +else + error "Could not read a config file at '${CONFIG_FILE}'." +fi + +##### ##### ##### ##### ##### ##### ##### ##### ##### ##### ##### ##### ##### ##### ##### +# API INTERACTION + +# Define an auth header argument (enabling usage of different ways) +AUTH_HEADER_ARG="" + +# Check for Dataverse Unblock API Key present (option with file/env var) +# This is only required if the host is not localhost (then there may be no key necessary) +if ! 
[[ "${DATAVERSE_URL}" == *"://localhost"* ]] || [ -n "${ADMIN_API_UNBLOCK_KEY}" ]; then + # The argument should not be empty + if [ -z "${ADMIN_API_UNBLOCK_KEY}" ]; then + error "You must provide the Dataverse API Unblock Key to this script." + # In case it's not empty, check if it's a file path and read the key from there + elif [ -f "${ADMIN_API_UNBLOCK_KEY}" ] && [ -r "${ADMIN_API_UNBLOCK_KEY}" ]; then + echo "Reading Dataverse API Unblock Key from ${ADMIN_API_UNBLOCK_KEY}." + if ! API_KEY_FILE_CONTENT=$(cat "${ADMIN_API_UNBLOCK_KEY}" 2>/dev/null); then + error "Could not read unblock key from file ${ADMIN_API_UNBLOCK_KEY}." + fi + # Validate the key is not empty + if [ -z "${API_KEY_FILE_CONTENT}" ]; then + error "API key file ${ADMIN_API_UNBLOCK_KEY} appears empty." + fi + ADMIN_API_UNBLOCK_KEY="$API_KEY_FILE_CONTENT" + fi + # Very basic error check (as there is no clear format or formal spec for the key) + if [ ${#ADMIN_API_UNBLOCK_KEY} -lt 5 ]; then + error "API key appears to be too short (<5 chars)." + fi + + # Build the header argument for Admin API Authentication via unblock key + AUTH_HEADER_ARG="X-Dataverse-unblock-key: ${ADMIN_API_UNBLOCK_KEY}" +fi + +# Check or wait for Dataverse API being responsive +echo "Waiting for ${DATAVERSE_URL} to become ready in max ${TIMEOUT}." +wait4x http "${DATAVERSE_URL}/api/info/version" -i 8s -t "$TIMEOUT" --expect-status-code 200 --expect-body-json data.version + +# Check for Dataverse Admin API endpoints being reachable by retrieving the current DB options, expect blockades! +CURRENT_SETTINGS=$(mktemp) +echo "Retrieving settings from running instance." +# TODO: Do we need to support pre v6.7 style unblock key query parameter? +curl -sSL --fail-with-body -o "${CURRENT_SETTINGS}" -H "${AUTH_HEADER_ARG}" "${DATAVERSE_URL}/api/admin/settings" \ + || error "Failed. Response message: $( cat "${CURRENT_SETTINGS}")" \ + && echo "Success!"
+ # TODO: while it's nice to have the current settings written out, it may contain sensitive information (so don't). + # && ( echo "Success! Current settings: "; jq '.data' < "$CURRENT_SETTINGS" ) + +# We need to make the settings update atomic. +echo "Replacing settings." +RESPONSE=$(mktemp) +curl -sSL --fail-with-body -o "${RESPONSE}" -X PUT -H "${AUTH_HEADER_ARG}" --json @"${CONV_CONF_FILE}" "${DATAVERSE_URL}/api/admin/settings" \ + || error "Failed. Response message: $( jq ".message" < "${RESPONSE}" )" \ + && ( echo -e "Success!\nOperations executed: "; jq '.data' < "$RESPONSE" ) diff --git a/modules/container-configbaker/scripts/bootstrap/demo/init.sh b/modules/container-configbaker/scripts/bootstrap/demo/init.sh index aa73cb5edff..b2735b50b28 100644 --- a/modules/container-configbaker/scripts/bootstrap/demo/init.sh +++ b/modules/container-configbaker/scripts/bootstrap/demo/init.sh @@ -31,7 +31,7 @@ fi echo "" echo "Revoke the key that allows for creation of builtin users..." -curl -sS -X DELETE "${DATAVERSE_URL}/api/admin/settings/BuiltinUsers.KEY" +curl -sS -X DELETE "${DATAVERSE_URL}/api/admin/settings/:BuiltinUsersKey" # TODO: stop using these deprecated database settings. See https://github.com/IQSS/dataverse/pull/11454 echo "" diff --git a/modules/container-configbaker/scripts/util/common.sh b/modules/container-configbaker/scripts/util/common.sh new file mode 100644 index 00000000000..91de5257a5c --- /dev/null +++ b/modules/container-configbaker/scripts/util/common.sh @@ -0,0 +1,17 @@ +#!/usr/bin/env bash + +function error { + echo "ERROR:" "$@" >&2 + exit 2 +} + +function exists_on_path { + type "$1" >/dev/null 2>&1 && return 0 + ( IFS=:; for p in $PATH; do [ -x "${p%/}/$1" ] && return 0; done; return 1 ) +} + +function require_on_path { + if ! exists_on_path "$1"; then + error "No $1 executable found on PATH."
+ fi +} diff --git a/modules/container-configbaker/scripts/util/read-to-env.sh b/modules/container-configbaker/scripts/util/read-to-env.sh new file mode 100644 index 00000000000..485586521ab --- /dev/null +++ b/modules/container-configbaker/scripts/util/read-to-env.sh @@ -0,0 +1,34 @@ +#!/usr/bin/env bash + +set -euo pipefail + +source "$(dirname "${BASH_SOURCE[0]}")/common.sh" + +# Read from a target into environment variables. +# Parameters: $target +# Case A) If $target is a file, simply source it. +# Case B) If $target is a directory, parse dirs and files in it as variable names and file content as value +function read_to_env() { + local target="$1" + + if [ -f "$target" ] && [ -r "$target" ]; then + set -o allexport + # shellcheck disable=SC1090 + source "$target" + set +o allexport + elif [ -d "$target" ] && [ -r "$target" ] && [ -x "$target" ]; then + # Find all files (K8s secrets are symlinks, so look for not directory & remove the hidden mounted files.) + FILES=$( find "$target" -not -type d -printf '%P\n' | grep -v '^\.\.' ) + for FILE in $FILES; do + # Same as MPCONFIG does! 
+ VARNAME=$( echo "$FILE" | tr '[:lower:]' '[:upper:]' | tr '/' '_' ) + VARVAL=$( cat "$target/$FILE") + + # Use printf to create the variable in global scope + printf -v "$VARNAME" '%s' "$VARVAL" + export "${VARNAME?}" + done + else + error "'$target' not a (readable) environment file or directory" + fi +} diff --git a/pom.xml b/pom.xml index a9d0bb0b26b..9a199c714e7 100644 --- a/pom.xml +++ b/pom.xml @@ -21,6 +21,10 @@ false false integration + + + -Ddummy.jacoco.property=true + -Ddummy.jacoco.property=true @@ -1036,7 +1040,13 @@ ${testsToExclude} ${skipUnitTests} - ${surefire.jacoco.args} ${argLine} + + @{surefire.jacoco.args} ${argLine} **/builtin-users-spi/** @@ -1048,7 +1058,13 @@ maven-failsafe-plugin ${it.groups} - ${failsafe.jacoco.args} ${argLine} + + @{failsafe.jacoco.args} ${argLine} ${skipIntegrationTests} diff --git a/scripts/api/post-install-api-block.sh b/scripts/api/post-install-api-block.sh index 4cc0ac783f7..f7753665b5b 100755 --- a/scripts/api/post-install-api-block.sh +++ b/scripts/api/post-install-api-block.sh @@ -4,7 +4,7 @@ # the sensitive API endpoints, in order to block it for the general public. 
# First, revoke the authentication token from the built-in user: -curl -X DELETE $SERVER/admin/settings/BuiltinUsers.KEY +curl -X DELETE "$SERVER/admin/settings/:BuiltinUsersKey" # Block the sensitive endpoints: # Relevant settings: diff --git a/scripts/api/setup-all.sh b/scripts/api/setup-all.sh index b7f962209e4..bd0bd77c52b 100755 --- a/scripts/api/setup-all.sh +++ b/scripts/api/setup-all.sh @@ -57,7 +57,7 @@ echo "- Allow internal signup" curl -X PUT -d yes "${DATAVERSE_URL}/api/admin/settings/:AllowSignUp" curl -X PUT -d "/dataverseuser.xhtml?editMode=CREATE" "${DATAVERSE_URL}/api/admin/settings/:SignUpUrl" -curl -X PUT -d burrito "${DATAVERSE_URL}/api/admin/settings/BuiltinUsers.KEY" +curl -X PUT -d burrito "${DATAVERSE_URL}/api/admin/settings/:BuiltinUsersKey" curl -X PUT -d localhost-only "${DATAVERSE_URL}/api/admin/settings/:BlockedApiPolicy" curl -X PUT -d 'native/http' "${DATAVERSE_URL}/api/admin/settings/:UploadMethods" echo @@ -91,7 +91,7 @@ if [ $SECURESETUP = 1 ] then # Revoke the "burrito" super-key; # Block sensitive API endpoints; - curl -X DELETE "${DATAVERSE_URL}/api/admin/settings/BuiltinUsers.KEY" + curl -X DELETE "${DATAVERSE_URL}/api/admin/settings/:BuiltinUsersKey" curl -X PUT -d 'admin,builtin-users' "${DATAVERSE_URL}/api/admin/settings/:BlockedApiEndpoints" echo "Access to the /api/admin and /api/test is now disabled, except for connections from localhost." 
else diff --git a/scripts/api/setup-users.sh b/scripts/api/setup-users.sh index 141e1b3150f..7df771dc0fe 100755 --- a/scripts/api/setup-users.sh +++ b/scripts/api/setup-users.sh @@ -5,7 +5,7 @@ SERVER=http://localhost:8080/api echo Setting up users on $SERVER echo ============================================== -curl -X PUT -d burrito $SERVER/admin/settings/BuiltinUsers.KEY +curl -X PUT -d burrito "$SERVER/admin/settings/:BuiltinUsersKey" peteResp=$(curl -s -H "Content-type:application/json" -X POST -d @data/userPete.json "$SERVER/builtin-users?password=pete&key=burrito") diff --git a/scripts/issues/2454/run-test.sh b/scripts/issues/2454/run-test.sh index 49eb45a8a5e..5ae0ac33f4d 100755 --- a/scripts/issues/2454/run-test.sh +++ b/scripts/issues/2454/run-test.sh @@ -39,7 +39,7 @@ if [ $SETUP_NEEDED == "yes" ]; then echo $ROOT_USER api key is $ROOT_KEY # Create @anAuthUser - USER_CREATION_KEY=$($DB "SELECT content FROM setting WHERE name='BuiltinUsers.KEY'") + USER_CREATION_KEY=$($DB "SELECT content FROM setting WHERE name=':BuiltinUsersKey'") AN_AUTH_USER_KEY=$( curl -s -X POST -d@anAuthUser.json -H"Content-type:application/json" $ENDPOINT/builtin-users?password=XXX\&key=$USER_CREATION_KEY | jq .data.apiToken | tr -d \") ANOTHER_AUTH_USER_KEY=$( curl -s -X POST -d@anotherAuthUser.json -H"Content-type:application/json" $ENDPOINT/builtin-users?password=XXX\&key=$USER_CREATION_KEY | jq .data.apiToken | tr -d \") echo diff --git a/src/main/java/edu/harvard/iq/dataverse/DataFileCategoryServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/DataFileCategoryServiceBean.java index 29dcb22c3ec..d29b5670952 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DataFileCategoryServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/DataFileCategoryServiceBean.java @@ -1,6 +1,7 @@ package edu.harvard.iq.dataverse; import edu.harvard.iq.dataverse.settings.SettingsServiceBean; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.FileCategories; import 
edu.harvard.iq.dataverse.util.BundleUtil; import jakarta.ejb.EJB; @@ -21,7 +22,7 @@ @Stateless public class DataFileCategoryServiceBean { - public static final String FILE_CATEGORIES_KEY = ":FileCategories"; + public static final String FILE_CATEGORIES_KEY = FileCategories.toString(); @EJB private SettingsServiceBean settingsService; diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetFieldServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/DatasetFieldServiceBean.java index ebee7c20ba2..e6b2711b443 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetFieldServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetFieldServiceBean.java @@ -53,6 +53,7 @@ import org.apache.http.protocol.HttpContext; import org.apache.http.util.EntityUtils; import edu.harvard.iq.dataverse.settings.SettingsServiceBean; +import edu.harvard.iq.dataverse.util.ListSplitUtil; /** * @@ -908,12 +909,12 @@ public String getFieldLanguage(String languages, String localeCode) { // If the fields list of supported languages contains the current locale (e.g. // the lang of the UI, or the current metadata input/display lang (tbd)), use // that. 
Otherwise, return the first in the list - String[] langStrings = languages.split("\\s*,\\s*"); - if (langStrings.length > 0) { - if (Arrays.asList(langStrings).contains(localeCode)) { + final List<String> langStrings = ListSplitUtil.split(languages); + if (!langStrings.isEmpty()) { + if (langStrings.contains(localeCode)) { return localeCode; } else { - return langStrings[0]; + return langStrings.get(0); } } return null; diff --git a/src/main/java/edu/harvard/iq/dataverse/EditDataFilesPageHelper.java b/src/main/java/edu/harvard/iq/dataverse/EditDataFilesPageHelper.java index 883baeedef4..7b5c3aa0857 100644 --- a/src/main/java/edu/harvard/iq/dataverse/EditDataFilesPageHelper.java +++ b/src/main/java/edu/harvard/iq/dataverse/EditDataFilesPageHelper.java @@ -1,5 +1,6 @@ package edu.harvard.iq.dataverse; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.CreateDataFilesMaxErrorsToDisplay; import edu.harvard.iq.dataverse.util.BundleUtil; import edu.harvard.iq.dataverse.util.file.CreateDataFileResult; import org.apache.commons.text.StringEscapeUtils; @@ -18,7 +19,7 @@ @Stateless public class EditDataFilesPageHelper { - public static final String MAX_ERRORS_TO_DISPLAY_SETTING = ":CreateDataFilesMaxErrorsToDisplay"; + public static final String MAX_ERRORS_TO_DISPLAY_SETTING = CreateDataFilesMaxErrorsToDisplay.toString(); public static final Integer MAX_ERRORS_TO_DISPLAY = 5; @Inject diff --git a/src/main/java/edu/harvard/iq/dataverse/EditDatafilesPage.java b/src/main/java/edu/harvard/iq/dataverse/EditDatafilesPage.java index 3fa1c8b2c10..fd0f3be9871 100644 --- a/src/main/java/edu/harvard/iq/dataverse/EditDatafilesPage.java +++ b/src/main/java/edu/harvard/iq/dataverse/EditDatafilesPage.java @@ -81,6 +81,8 @@ import java.util.Collection; import java.util.Set; import java.util.logging.Level; +import java.util.stream.Collectors; + import jakarta.faces.event.AjaxBehaviorEvent; import jakarta.faces.event.FacesEvent; import jakarta.servlet.ServletOutputStream; @@
-381,19 +383,11 @@ public String getHumanPerFormatTabularLimits() { } public String populateHumanPerFormatTabularLimits() { - String keyPrefix = ":TabularIngestSizeLimit:"; - List formatLimits = new ArrayList<>(); - for (Setting setting : settingsService.listAll()) { - String name = setting.getName(); - if (!name.startsWith(keyPrefix)) { - continue; - } - String tabularName = setting.getName().substring(keyPrefix.length()); - String bytes = setting.getContent(); - String humanReadableSize = FileSizeChecker.bytesToHumanReadable(Long.valueOf(bytes)); - formatLimits.add(tabularName + ": " + humanReadableSize); - } - return String.join(", ", formatLimits); + return systemConfig.getTabularIngestSizeLimits().entrySet().stream() + // The human-readable list shall not contain the setting for non-matching formats + .filter(entry -> ! entry.getKey().equals(SystemConfig.TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY)) + .map(entry -> entry.getKey() + ": " + FileSizeChecker.bytesToHumanReadable(entry.getValue())) + .collect(Collectors.joining(", ")); } public Integer getFileUploadsAvailable() { diff --git a/src/main/java/edu/harvard/iq/dataverse/FileMetadata.java b/src/main/java/edu/harvard/iq/dataverse/FileMetadata.java index 932bbd60be6..b60b5afedd3 100644 --- a/src/main/java/edu/harvard/iq/dataverse/FileMetadata.java +++ b/src/main/java/edu/harvard/iq/dataverse/FileMetadata.java @@ -49,6 +49,7 @@ import edu.harvard.iq.dataverse.datavariable.VarGroup; import edu.harvard.iq.dataverse.datavariable.VariableMetadata; import edu.harvard.iq.dataverse.util.DateUtil; +import edu.harvard.iq.dataverse.util.ListSplitUtil; import edu.harvard.iq.dataverse.util.StringUtil; import java.util.HashSet; import java.util.Set; @@ -605,18 +606,18 @@ public int compare(FileMetadata o1, FileMetadata o2) { } }; - static Map categoryMap=null; + static Map categoryMap = null; public static void setCategorySortOrder(String categories) { - categoryMap=new HashMap(); - long i=1; - for(String cat: 
categories.split(",\\s*")) { - categoryMap.put(cat.toUpperCase(), i); - i++; - } + categoryMap = new HashMap(); + long i = 1; + for (String cat : ListSplitUtil.split(categories)) { + categoryMap.put(cat.toUpperCase(), i); + i++; + } } - public static Map getCategorySortOrder() { + public static Map getCategorySortOrder() { return categoryMap; } diff --git a/src/main/java/edu/harvard/iq/dataverse/SettingsWrapper.java b/src/main/java/edu/harvard/iq/dataverse/SettingsWrapper.java index 69f3123e7e1..23a26a8cf2c 100644 --- a/src/main/java/edu/harvard/iq/dataverse/SettingsWrapper.java +++ b/src/main/java/edu/harvard/iq/dataverse/SettingsWrapper.java @@ -14,6 +14,7 @@ import edu.harvard.iq.dataverse.settings.SettingsServiceBean; import edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key; import edu.harvard.iq.dataverse.util.BundleUtil; +import edu.harvard.iq.dataverse.util.ListSplitUtil; import edu.harvard.iq.dataverse.util.StringUtil; import edu.harvard.iq.dataverse.util.SystemConfig; import edu.harvard.iq.dataverse.UserNotification.Type; @@ -50,8 +51,7 @@ public class SettingsWrapper implements java.io.Serializable { static final Logger logger = Logger.getLogger(SettingsWrapper.class.getCanonicalName()); - public static final String COMMA_BETWEEN_OPTIONAL_WHITE_SPACE = "\\s*,\\s*"; - + @EJB SettingsServiceBean settingsService; @@ -218,7 +218,7 @@ public Integer getInteger(String settingKey, Integer defaultValue) { private void initSettingsMap() { // initialize settings map settingsMap = new HashMap<>(); - for (Setting setting : settingsService.listAll()) { + for (Setting setting : settingsService.listAllWithoutLocalizations()) { settingsMap.put(setting.getName(), setting.getContent()); } } @@ -393,10 +393,12 @@ public boolean isRsyncOnly() { rsyncOnly = false; } else { String uploadMethods = getValueForKey(SettingsServiceBean.Key.UploadMethods); - if (uploadMethods==null){ + if (uploadMethods == null) { rsyncOnly = false; } else { - rsyncOnly = 
Arrays.asList(uploadMethods.toLowerCase().split(COMMA_BETWEEN_OPTIONAL_WHITE_SPACE)).size() == 1 && uploadMethods.toLowerCase().equals(SystemConfig.FileUploadMethods.RSYNC.toString()); + String normalizedUploadMethods = uploadMethods.toLowerCase(); + rsyncOnly = ListSplitUtil.split(normalizedUploadMethods).size() == 1 + && normalizedUploadMethods.equals(SystemConfig.FileUploadMethods.RSYNC.toString()); } } } @@ -424,11 +426,11 @@ public String getSupportTeamEmail() { public Integer getUploadMethodsCount() { if (uploadMethodsCount == null) { - String uploadMethods = getValueForKey(SettingsServiceBean.Key.UploadMethods); - if (uploadMethods==null){ + String uploadMethods = getValueForKey(SettingsServiceBean.Key.UploadMethods); + if (uploadMethods == null) { uploadMethodsCount = 0; } else { - uploadMethodsCount = Arrays.asList(uploadMethods.toLowerCase().split(COMMA_BETWEEN_OPTIONAL_WHITE_SPACE)).size(); + uploadMethodsCount = ListSplitUtil.split(uploadMethods).size(); } } return uploadMethodsCount; @@ -502,7 +504,7 @@ public boolean shouldBeAnonymized(DatasetField df) { if (anonymizedFieldTypes == null) { anonymizedFieldTypes = new ArrayList(); String names = get(SettingsServiceBean.Key.AnonymizedFieldTypeNames.toString(), ""); - anonymizedFieldTypes.addAll(Arrays.asList(names.split(COMMA_BETWEEN_OPTIONAL_WHITE_SPACE))); + anonymizedFieldTypes.addAll(ListSplitUtil.split(names)); } return anonymizedFieldTypes.contains(df.getDatasetFieldType().getName()); } @@ -826,11 +828,11 @@ public String getMetricsUrl() { } private Boolean getUploadMethodAvailable(String method){ - String uploadMethods = getValueForKey(SettingsServiceBean.Key.UploadMethods); - if (uploadMethods==null){ + String uploadMethods = getValueForKey(SettingsServiceBean.Key.UploadMethods); + if (uploadMethods == null) { return false; } else { - return Arrays.asList(uploadMethods.toLowerCase().split(COMMA_BETWEEN_OPTIONAL_WHITE_SPACE)).contains(method); + return 
ListSplitUtil.splitToLowerCaseSet(uploadMethods).contains(method); } } diff --git a/src/main/java/edu/harvard/iq/dataverse/api/Admin.java b/src/main/java/edu/harvard/iq/dataverse/api/Admin.java index 3df78648433..75aedb038dc 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/Admin.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/Admin.java @@ -18,6 +18,7 @@ import edu.harvard.iq.dataverse.FileMetadata; import edu.harvard.iq.dataverse.api.auth.AuthRequired; import edu.harvard.iq.dataverse.settings.JvmSettings; +import edu.harvard.iq.dataverse.settings.SettingsValidationException; import edu.harvard.iq.dataverse.util.StringUtil; import edu.harvard.iq.dataverse.util.cache.CacheFactoryBean; import edu.harvard.iq.dataverse.util.json.JsonPrinter; @@ -64,6 +65,7 @@ import jakarta.ws.rs.PathParam; import jakarta.ws.rs.container.ContainerRequestContext; import jakarta.ws.rs.core.Context; +import jakarta.ws.rs.core.MediaType; import jakarta.ws.rs.core.Response; import static edu.harvard.iq.dataverse.util.json.NullSafeJsonBuilder.jsonObjectBuilder; @@ -113,6 +115,7 @@ import edu.harvard.iq.dataverse.util.ArchiverUtil; import edu.harvard.iq.dataverse.util.BundleUtil; import edu.harvard.iq.dataverse.util.FileUtil; +import edu.harvard.iq.dataverse.util.ListSplitUtil; import edu.harvard.iq.dataverse.util.SystemConfig; import edu.harvard.iq.dataverse.util.URLTokenUtil; import edu.harvard.iq.dataverse.util.UrlSignerUtil; @@ -133,6 +136,11 @@ import jakarta.ws.rs.QueryParam; import jakarta.ws.rs.WebApplicationException; import jakarta.ws.rs.core.StreamingOutput; +import org.eclipse.microprofile.openapi.annotations.media.Content; +import org.eclipse.microprofile.openapi.annotations.media.Schema; +import org.eclipse.microprofile.openapi.annotations.responses.APIResponse; +import org.eclipse.microprofile.openapi.annotations.responses.APIResponses; + import java.nio.file.Paths; import java.util.TreeMap; @@ -194,50 +202,119 @@ public class Admin extends AbstractApiBean { public 
static final String listUsersPartialAPIPath = "list-users"; public static final String listUsersFullAPIPath = "/api/admin/" + listUsersPartialAPIPath; - + @Path("settings") @GET + @APIResponses({ + @APIResponse(responseCode = "200", + description = "All database options successfully queried", + // The schema may be extended later to better describe what the JSON object looks like. + content = @Content(schema = @Schema(implementation = JsonObject.class))), + }) public Response listAllSettings() { - JsonObjectBuilder bld = jsonObjectBuilder(); - settingsSvc.listAll().forEach(s -> bld.add(s.getName(), s.getContent())); - return ok(bld); + return ok(settingsSvc.listAllAsJson()); } - + + @Path("settings") + @PUT + @Consumes(MediaType.APPLICATION_JSON) + @APIResponses({ + @APIResponse(responseCode = "200", description = "All database options successfully updated") + }) + public Response putAllSettings(JsonObject settings) { + try { + // Basic JSON structure validation only + if (settings == null || settings.isEmpty()) { + return error(Response.Status.BAD_REQUEST, "Empty or invalid JSON object"); + } + + // Transfer to domain objects and deeper validation to be handled by the service layer. 
+ JsonObjectBuilder successfulOperations = settingsSvc.setAllFromJson(settings); + return ok("All database options successfully updated.", successfulOperations); + } catch (SettingsValidationException sve) { + return error(Response.Status.BAD_REQUEST, sve.getMessage()); + } + } + @Path("settings/{name}") @PUT public Response putSetting(@PathParam("name") String name, String content) { - Setting s = settingsSvc.set(name, content); - return ok(jsonObjectBuilder().add(s.getName(), s.getContent())); + try { + SettingsServiceBean.validateSettingName(name); + + Setting s = settingsSvc.set(name, content); + return ok("Setting " + name + " added."); + } catch (SettingsValidationException sve) { + return error(Response.Status.BAD_REQUEST, sve.getMessage()); + } } @Path("settings/{name}/lang/{lang}") @PUT public Response putSettingLang(@PathParam("name") String name, @PathParam("lang") String lang, String content) { - Setting s = settingsSvc.set(name, lang, content); - return ok("Setting " + name + " - " + lang + " - added."); + try { + SettingsServiceBean.validateSettingName(name); + SettingsServiceBean.validateSettingLang(lang); + + Setting s = settingsSvc.set(name, lang, content); + return ok("Setting " + name + " added for language " + lang + "."); + } catch (SettingsValidationException sve) { + return error(Response.Status.BAD_REQUEST, sve.getMessage()); + } } @Path("settings/{name}") @GET public Response getSetting(@PathParam("name") String name) { - String s = settingsSvc.get(name); - - return (s != null) ? ok(s) : notFound("Setting " + name + " not found"); + try { + SettingsServiceBean.validateSettingName(name); + + String content = settingsSvc.get(name); + return (content != null) ?
ok(content) : notFound("Setting " + name + " not found."); + } catch (SettingsValidationException sve) { + return error(Response.Status.BAD_REQUEST, sve.getMessage()); + } + } + + @Path("settings/{name}/lang/{lang}") + @GET + public Response getSettingLang(@PathParam("name") String name, @PathParam("lang") String lang) { + try { + SettingsServiceBean.validateSettingName(name); + SettingsServiceBean.validateSettingLang(lang); + + String content = settingsSvc.get(name, lang, null); + return (content != null) ? ok(content) : notFound("Setting " + name + " for language " + lang + " not found."); + } catch (SettingsValidationException sve) { + return error(Response.Status.BAD_REQUEST, sve.getMessage()); + } } @Path("settings/{name}") @DELETE public Response deleteSetting(@PathParam("name") String name) { - settingsSvc.delete(name); - - return ok("Setting " + name + " deleted."); + try { + SettingsServiceBean.validateSettingName(name); + + settingsSvc.delete(name); + return ok("Setting " + name + " deleted."); + } catch (SettingsValidationException sve) { + return error(Response.Status.BAD_REQUEST, sve.getMessage()); + } } @Path("settings/{name}/lang/{lang}") @DELETE public Response deleteSettingLang(@PathParam("name") String name, @PathParam("lang") String lang) { - settingsSvc.delete(name, lang); - return ok("Setting " + name + " - " + lang + " deleted."); + try { + SettingsServiceBean.validateSettingName(name); + SettingsServiceBean.validateSettingLang(lang); + + settingsSvc.delete(name, lang); + return ok("Setting " + name + " for language " + lang + " deleted."); + } catch (SettingsValidationException sve) { + return error(Response.Status.BAD_REQUEST, sve.getMessage()); + } } @Path("template/{id}") @@ -2167,7 +2244,7 @@ public Response addRoleAssignementsToChildren(@Context ContainerRequestContext c boolean inheritAllRoles = false; String rolesString = settingsSvc.getValueForKey(SettingsServiceBean.Key.InheritParentRoleAssignments, ""); if (rolesString.length() > 0) { -
ArrayList rolesToInherit = new ArrayList(Arrays.asList(rolesString.split("\\s*,\\s*"))); + ArrayList rolesToInherit = new ArrayList<>(ListSplitUtil.split(rolesString)); if (!rolesToInherit.isEmpty()) { if (rolesToInherit.contains("*")) { inheritAllRoles = true; diff --git a/src/main/java/edu/harvard/iq/dataverse/api/BuiltinUsers.java b/src/main/java/edu/harvard/iq/dataverse/api/BuiltinUsers.java index 317f7d6c870..79d5682d4f3 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/BuiltinUsers.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/BuiltinUsers.java @@ -40,8 +40,6 @@ public class BuiltinUsers extends AbstractApiBean { private static final Logger logger = Logger.getLogger(BuiltinUsers.class.getName()); - private static final String API_KEY_IN_SETTINGS = "BuiltinUsers.KEY"; - @EJB protected BuiltinUserServiceBean builtinUserSvc; @@ -129,7 +127,7 @@ private Response internalSave(BuiltinUser user, String password, String key) { } private Response internalSave(BuiltinUser user, String password, String key, Boolean sendEmailNotification) { - String expectedKey = settingsSvc.get(API_KEY_IN_SETTINGS); + String expectedKey = settingsSvc.getValueForKey(SettingsServiceBean.Key.BuiltinUsersKey); if (expectedKey == null) { return error(Status.SERVICE_UNAVAILABLE, "Dataverse config issue: No API key defined for built in user management"); diff --git a/src/main/java/edu/harvard/iq/dataverse/api/Datasets.java b/src/main/java/edu/harvard/iq/dataverse/api/Datasets.java index df292762353..4b3db65556c 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/Datasets.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/Datasets.java @@ -5317,7 +5317,8 @@ public Response getPrivateUrlDatasetVersion(@PathParam("privateUrlToken") String } JsonObjectBuilder responseJson; if (isAnonymizedAccess) { - List anonymizedFieldTypeNamesList = new ArrayList<>(Arrays.asList(anonymizedFieldTypeNames.split(SettingsWrapper.COMMA_BETWEEN_OPTIONAL_WHITE_SPACE))); + // Use ListSplitUtil for 
consistent CSV parsing + List anonymizedFieldTypeNamesList = new ArrayList<>(ListSplitUtil.split(anonymizedFieldTypeNames)); responseJson = json(dsv, anonymizedFieldTypeNamesList, true, returnOwners); } else { responseJson = json(dsv, null, true, returnOwners); @@ -5343,7 +5344,8 @@ public Response getPreviewUrlDatasetVersion(@PathParam("previewUrlToken") String } JsonObjectBuilder responseJson; if (isAnonymizedAccess) { - List anonymizedFieldTypeNamesList = new ArrayList<>(Arrays.asList(anonymizedFieldTypeNames.split(SettingsWrapper.COMMA_BETWEEN_OPTIONAL_WHITE_SPACE))); + // Use ListSplitUtil for consistent CSV parsing + List anonymizedFieldTypeNamesList = new ArrayList<>(ListSplitUtil.split(anonymizedFieldTypeNames)); responseJson = json(dsv, anonymizedFieldTypeNamesList, true, returnOwners); } else { responseJson = json(dsv, null, true, returnOwners); diff --git a/src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java b/src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java index a5a336e1c9f..29bac86e658 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/Dataverses.java @@ -2001,6 +2001,34 @@ public Response createTemplate(@Context ContainerRequestContext crc, String body return e.getResponse(); } } + + @GET + @AuthRequired + @Path("{identifier}/allowedMetadataLanguages") + public Response getMetadataLanguage(@Context ContainerRequestContext crc, @PathParam("identifier") String dvIdtf) { + return response(req -> { + Dataverse dataverse = findDataverseOrDie(dvIdtf); + return ok(jsonLanguage(execCommand( + new GetDataverseMetadataLanguageCommand(req, dataverse)))); + }, getRequestUser(crc)); + } + + @PUT + @AuthRequired + @Path("{identifier}/allowedMetadataLanguages/{metadataLanguage}") + public Response setMetadataLanguage(@Context ContainerRequestContext crc, @PathParam("identifier") String dvIdtf, @PathParam("metadataLanguage") String lang) { + return response(req -> { + Map langMap 
= settingsService.getBaseMetadataLanguageMap(null, true); + if (langMap.isEmpty()) { + return badRequest("There are no metadata languages configured on this server"); + } + if (!langMap.containsKey(lang)) { + return badRequest("The specified metadata language " + lang + " is not allowed on this server!"); + } + Dataverse dataverse = findDataverseOrDie(dvIdtf); + return ok(jsonLanguage(execCommand(new SetDataverseMetadataLanguageCommand(req, dataverse, lang)))); + }, getRequestUser(crc)); + } @GET @AuthRequired diff --git a/src/main/java/edu/harvard/iq/dataverse/api/Workflows.java b/src/main/java/edu/harvard/iq/dataverse/api/Workflows.java index 4eadcedf71a..7bd19b3a403 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/Workflows.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/Workflows.java @@ -3,6 +3,8 @@ import edu.harvard.iq.dataverse.authorization.groups.impl.ipaddress.IpGroup; import edu.harvard.iq.dataverse.authorization.groups.impl.ipaddress.ip.IpAddress; import edu.harvard.iq.dataverse.authorization.groups.impl.ipaddress.ip.IpAddressRange; +import edu.harvard.iq.dataverse.settings.SettingsServiceBean; +import edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key; import edu.harvard.iq.dataverse.workflow.PendingWorkflowInvocation; import edu.harvard.iq.dataverse.workflow.WorkflowServiceBean; import java.util.Arrays; @@ -60,7 +62,7 @@ private boolean isAllowed(IpAddress addr) { private void updateWhitelist() { IpGroup updatedList = new IpGroup(); - String[] ips = settingsSvc.get(WorkflowsAdmin.IP_WHITELIST_KEY, "127.0.0.1;::1").split(";"); + String[] ips = settingsSvc.getValueForKey(Key.WorkflowsAdminIpWhitelist, WorkflowsAdmin.DEFAULT_IP_ALLOWLIST).split(WorkflowsAdmin.IP_SEPARATOR); Arrays.stream(ips) .forEach( str -> updatedList.add( IpAddressRange.makeSingle( diff --git a/src/main/java/edu/harvard/iq/dataverse/api/WorkflowsAdmin.java b/src/main/java/edu/harvard/iq/dataverse/api/WorkflowsAdmin.java index 15478aacff7..ecb7248cae9 100644 --- 
a/src/main/java/edu/harvard/iq/dataverse/api/WorkflowsAdmin.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/WorkflowsAdmin.java @@ -3,6 +3,8 @@ import edu.harvard.iq.dataverse.authorization.groups.impl.ipaddress.ip.IpAddress; import edu.harvard.iq.dataverse.util.json.JsonParseException; import edu.harvard.iq.dataverse.util.json.JsonParser; + +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.WorkflowsAdminIpWhitelist; import static edu.harvard.iq.dataverse.util.json.JsonPrinter.brief; import static edu.harvard.iq.dataverse.util.json.JsonPrinter.json; import static edu.harvard.iq.dataverse.util.json.JsonPrinter.toJsonArray; @@ -30,8 +32,9 @@ */ @Path("admin/workflows") public class WorkflowsAdmin extends AbstractApiBean { - - public static final String IP_WHITELIST_KEY="WorkflowsAdmin#IP_WHITELIST_KEY"; + + public static final String IP_SEPARATOR = ";"; + public static final String DEFAULT_IP_ALLOWLIST = "127.0.0.1" + IP_SEPARATOR + "::1"; @EJB WorkflowServiceBean workflows; @@ -153,14 +156,14 @@ public Response deleteWorkflow(@PathParam("id") String id ) { @Path("/ip-whitelist") @GET public Response getIpWhitelist() { - return ok( settingsSvc.get(IP_WHITELIST_KEY, "127.0.0.1;::1") ); + return ok( settingsSvc.getValueForKey(WorkflowsAdminIpWhitelist, DEFAULT_IP_ALLOWLIST) ); } @Path("/ip-whitelist") @PUT public Response setIpWhitelist(String body) { String ipList = body.trim(); - String[] ips = ipList.split(";"); + String[] ips = ipList.split(IP_SEPARATOR); boolean allIpsOk = Arrays.stream(ips).allMatch(ip->{ try { IpAddress.valueOf(ip); @@ -170,18 +173,17 @@ public Response setIpWhitelist(String body) { } } ); if (allIpsOk) { - settingsSvc.set(IP_WHITELIST_KEY, ipList); - return ok( settingsSvc.get(IP_WHITELIST_KEY, "127.0.0.1;::1") ); + settingsSvc.setValueForKey(WorkflowsAdminIpWhitelist, ipList); + return ok( settingsSvc.getValueForKey(WorkflowsAdminIpWhitelist, DEFAULT_IP_ALLOWLIST) ); } else { return badRequest("Request contains 
illegal IP addresses."); } - } @Path("/ip-whitelist") @DELETE public Response deleteIpWhitelist() { - settingsSvc.delete(IP_WHITELIST_KEY); + settingsSvc.deleteValueForKey(WorkflowsAdminIpWhitelist); return ok( "Restored whitelist to default (127.0.0.1;::1)" ); } diff --git a/src/main/java/edu/harvard/iq/dataverse/api/util/JsonResponseBuilder.java b/src/main/java/edu/harvard/iq/dataverse/api/util/JsonResponseBuilder.java index a80d54508fd..9095a40c608 100644 --- a/src/main/java/edu/harvard/iq/dataverse/api/util/JsonResponseBuilder.java +++ b/src/main/java/edu/harvard/iq/dataverse/api/util/JsonResponseBuilder.java @@ -131,7 +131,7 @@ public JsonResponseBuilder requestContentType(HttpServletRequest request) { * @return The enhanced builder */ public JsonResponseBuilder internalError(Throwable ex) { - this.entityBuilder.add("interalError", ex.getClass().getSimpleName()); + this.entityBuilder.add("internalError", ex.getClass().getSimpleName()); if (ex.getCause() != null) { this.entityBuilder.add("internalCause", ex.getCause().getClass().getSimpleName()); } diff --git a/src/main/java/edu/harvard/iq/dataverse/batch/jobs/importer/filesystem/FileRecordReader.java b/src/main/java/edu/harvard/iq/dataverse/batch/jobs/importer/filesystem/FileRecordReader.java index 9ce30683a87..175683bbb16 100644 --- a/src/main/java/edu/harvard/iq/dataverse/batch/jobs/importer/filesystem/FileRecordReader.java +++ b/src/main/java/edu/harvard/iq/dataverse/batch/jobs/importer/filesystem/FileRecordReader.java @@ -25,6 +25,7 @@ import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; import edu.harvard.iq.dataverse.batch.jobs.importer.ImportMode; import edu.harvard.iq.dataverse.settings.JvmSettings; +import edu.harvard.iq.dataverse.util.ListSplitUtil; import org.apache.commons.io.filefilter.NotFileFilter; import org.apache.commons.io.filefilter.WildcardFileFilter; @@ -43,7 +44,6 @@ import java.io.FileFilter; import java.io.Serializable; import java.util.ArrayList; -import 
java.util.Arrays; import java.util.HashMap; import java.util.Iterator; import java.util.List; @@ -152,8 +152,13 @@ public File readItem() { * @return list of files */ private List getFiles(final File directory) { - // create filter from job xml excludes property - FileFilter excludeFilter = new NotFileFilter(new WildcardFileFilter(Arrays.asList(excludes.split("\\s*,\\s*")))); + // create filter from job xml excludes property using builder to avoid deprecated constructors + final String[] excludedPatterns = ListSplitUtil.split(excludes).toArray(new String[0]); + FileFilter excludeFilter = new NotFileFilter( + WildcardFileFilter.builder() + .setWildcards(excludedPatterns) + .get() + ); List files = new ArrayList<>(); File[] filesList = directory.listFiles(excludeFilter); if (filesList != null) { diff --git a/src/main/java/edu/harvard/iq/dataverse/dataaccess/GlobusAccessibleStore.java b/src/main/java/edu/harvard/iq/dataverse/dataaccess/GlobusAccessibleStore.java index 8bed60d8302..032ec1cfe48 100644 --- a/src/main/java/edu/harvard/iq/dataverse/dataaccess/GlobusAccessibleStore.java +++ b/src/main/java/edu/harvard/iq/dataverse/dataaccess/GlobusAccessibleStore.java @@ -1,5 +1,6 @@ package edu.harvard.iq.dataverse.dataaccess; +import edu.harvard.iq.dataverse.util.ListSplitUtil; import jakarta.json.Json; import jakarta.json.JsonArray; import jakarta.json.JsonArrayBuilder; @@ -38,10 +39,10 @@ public static String getTransferPath(String driverId) { } public static JsonArray getReferenceEndpointsWithPaths(String driverId) { - String[] endpoints = StorageIO.getConfigParamForDriver(driverId, AbstractRemoteOverlayAccessIO.REFERENCE_ENDPOINTS_WITH_BASEPATHS).split("\\s*,\\s*"); JsonArrayBuilder builder = Json.createArrayBuilder(); - for(int i=0;i allowedEndpoints = ListSplitUtil.split(rawEndpoints); + if (allowedEndpoints.isEmpty()) { + throw new IOException("dataverse.files." 
+ driverId + ".base-url is required"); } - return allowedEndpoints; + return allowedEndpoints.toArray(new String[0]); } diff --git a/src/main/java/edu/harvard/iq/dataverse/dataaccess/RemoteOverlayAccessIO.java b/src/main/java/edu/harvard/iq/dataverse/dataaccess/RemoteOverlayAccessIO.java index bca70259cb7..1613d1ec7cc 100644 --- a/src/main/java/edu/harvard/iq/dataverse/dataaccess/RemoteOverlayAccessIO.java +++ b/src/main/java/edu/harvard/iq/dataverse/dataaccess/RemoteOverlayAccessIO.java @@ -5,6 +5,7 @@ import edu.harvard.iq.dataverse.Dataverse; import edu.harvard.iq.dataverse.DvObject; import edu.harvard.iq.dataverse.datavariable.DataVariable; +import edu.harvard.iq.dataverse.util.ListSplitUtil; import edu.harvard.iq.dataverse.util.UrlSignerUtil; import java.io.FileNotFoundException; @@ -33,10 +34,10 @@ */ /* * Remote Overlay Driver - * + * * StorageIdentifier format: * ://// - * + * * baseUrl: http(s):// */ public class RemoteOverlayAccessIO extends AbstractRemoteOverlayAccessIO { @@ -48,7 +49,7 @@ public class RemoteOverlayAccessIO extends AbstractRemoteOve public RemoteOverlayAccessIO() { super(); } - + public RemoteOverlayAccessIO(T dvObject, DataAccessRequest req, String driverId) throws IOException { super(dvObject, req, driverId); this.setIsLocalFile(false); @@ -124,10 +125,10 @@ public void open(DataAccessOption... 
options) throws IOException { logger.fine("Setting size"); this.setSize(retrieveSizeFromMedia()); } - if (dataFile.getContentType() != null + if (dataFile.getContentType() != null && dataFile.getContentType().equals("text/tab-separated-values") - && dataFile.isTabularData() - && dataFile.getDataTable() != null + && dataFile.isTabularData() + && dataFile.getDataTable() != null && (!this.noVarHeader()) && (!dataFile.getDataTable().isStoredWithVariableHeader())) { @@ -317,7 +318,7 @@ protected void configureRemoteEndpoints() throws IOException { baseUrl = getConfigParam(BASE_URL); if (baseUrl == null) { //Will accept the first endpoint using the newer setting - baseUrl = getConfigParam(REFERENCE_ENDPOINTS_WITH_BASEPATHS).split("\\s*,\\s*")[0]; + baseUrl = ListSplitUtil.split(getConfigParam(REFERENCE_ENDPOINTS_WITH_BASEPATHS)).stream().findFirst().orElse(baseUrl); if (baseUrl == null) { throw new IOException("dataverse.files." + this.driverId + ".base-url is required"); } diff --git a/src/main/java/edu/harvard/iq/dataverse/datacapturemodule/DataCaptureModuleUtil.java b/src/main/java/edu/harvard/iq/dataverse/datacapturemodule/DataCaptureModuleUtil.java index 094d3976133..de2aa0aaee8 100644 --- a/src/main/java/edu/harvard/iq/dataverse/datacapturemodule/DataCaptureModuleUtil.java +++ b/src/main/java/edu/harvard/iq/dataverse/datacapturemodule/DataCaptureModuleUtil.java @@ -5,8 +5,8 @@ import edu.harvard.iq.dataverse.Dataset; import edu.harvard.iq.dataverse.DatasetVersion; import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser; +import edu.harvard.iq.dataverse.util.ListSplitUtil; import edu.harvard.iq.dataverse.util.SystemConfig; -import java.util.Arrays; import java.util.logging.Logger; import jakarta.json.Json; import jakarta.json.JsonObject; @@ -19,11 +19,11 @@ public class DataCaptureModuleUtil { @Deprecated(forRemoval = true, since = "2024-07-07") public static boolean rsyncSupportEnabled(String uploadMethodsSettings) { - 
logger.fine("uploadMethodsSettings: " + uploadMethodsSettings);; + logger.fine("uploadMethodsSettings: " + uploadMethodsSettings); if (uploadMethodsSettings==null){ return false; } else { - return Arrays.asList(uploadMethodsSettings.toLowerCase().split("\\s*,\\s*")).contains(SystemConfig.FileUploadMethods.RSYNC.toString()); + return ListSplitUtil.splitToLowerCaseSet(uploadMethodsSettings).contains(SystemConfig.FileUploadMethods.RSYNC.toString()); } } diff --git a/src/main/java/edu/harvard/iq/dataverse/dataset/DatasetUtil.java b/src/main/java/edu/harvard/iq/dataverse/dataset/DatasetUtil.java index 46f458c5403..2ce5471a523 100644 --- a/src/main/java/edu/harvard/iq/dataverse/dataset/DatasetUtil.java +++ b/src/main/java/edu/harvard/iq/dataverse/dataset/DatasetUtil.java @@ -13,6 +13,7 @@ import edu.harvard.iq.dataverse.dataaccess.ImageThumbConverter; import edu.harvard.iq.dataverse.util.BundleUtil; import edu.harvard.iq.dataverse.util.FileUtil; +import edu.harvard.iq.dataverse.util.ListSplitUtil; import java.awt.image.BufferedImage; import java.io.ByteArrayInputStream; import java.io.File; @@ -531,7 +532,7 @@ public static String[] getDatasetSummaryFieldNames(String customFieldNames) { } else { summaryFieldNames = customFieldNames; } - return summaryFieldNames.split("\\s*,\\s*"); + return ListSplitUtil.split(summaryFieldNames).toArray(new String[0]); } public static boolean isRsyncAppropriateStorageDriver(Dataset dataset){ diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateDataverseCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateDataverseCommand.java index b28302ba861..3071cfaea8f 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateDataverseCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/CreateDataverseCommand.java @@ -12,10 +12,10 @@ import edu.harvard.iq.dataverse.engine.command.exception.IllegalCommandException; import
edu.harvard.iq.dataverse.settings.SettingsServiceBean; import edu.harvard.iq.dataverse.util.BundleUtil; +import edu.harvard.iq.dataverse.util.ListSplitUtil; import java.sql.Timestamp; import java.util.ArrayList; -import java.util.Arrays; import java.util.Date; import java.util.List; @@ -107,7 +107,7 @@ protected Dataverse innerExecute(CommandContext ctxt) throws IllegalCommandExcep // Add additional role assignments if inheritance is set boolean inheritAllRoles = false; String rolesString = ctxt.settings().getValueForKey(SettingsServiceBean.Key.InheritParentRoleAssignments, ""); - ArrayList rolesToInherit = new ArrayList(Arrays.asList(rolesString.split("\\s*,\\s*"))); + ArrayList rolesToInherit = new ArrayList<>(ListSplitUtil.split(rolesString)); if (rolesString.length() > 0) { if (!rolesToInherit.isEmpty()) { if (rolesToInherit.contains("*")) { diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/DRSSubmitToArchiveCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/DRSSubmitToArchiveCommand.java index 594d4fe25ba..78e8454255b 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/DRSSubmitToArchiveCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/DRSSubmitToArchiveCommand.java @@ -56,12 +56,13 @@ import com.auth0.jwt.JWT; import com.auth0.jwt.algorithms.Algorithm; import com.auth0.jwt.exceptions.JWTCreationException; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.DRSArchiverConfig; @RequiredPermissions(Permission.PublishDataset) public class DRSSubmitToArchiveCommand extends S3SubmitToArchiveCommand implements Command { private static final Logger logger = Logger.getLogger(DRSSubmitToArchiveCommand.class.getName()); - private static final String DRS_CONFIG = ":DRSArchiverConfig"; + private static final String DRS_CONFIG = DRSArchiverConfig.toString(); private static final String ADMIN_METADATA = "admin_metadata"; private static final String 
S3_BUCKET_NAME = "s3_bucket_name"; private static final String S3_PATH = "s3_path"; diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/DuraCloudSubmitToArchiveCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/DuraCloudSubmitToArchiveCommand.java index 94f983f0c13..fe4a25091d7 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/DuraCloudSubmitToArchiveCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/DuraCloudSubmitToArchiveCommand.java @@ -7,6 +7,9 @@ import edu.harvard.iq.dataverse.authorization.users.ApiToken; import edu.harvard.iq.dataverse.engine.command.DataverseRequest; import edu.harvard.iq.dataverse.engine.command.RequiredPermissions; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.DuraCloudContext; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.DuraCloudHost; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.DuraCloudPort; import edu.harvard.iq.dataverse.workflow.step.Failure; import edu.harvard.iq.dataverse.workflow.step.WorkflowStepResult; @@ -36,9 +39,9 @@ public class DuraCloudSubmitToArchiveCommand extends AbstractSubmitToArchiveComm private static final Logger logger = Logger.getLogger(DuraCloudSubmitToArchiveCommand.class.getName()); private static final String DEFAULT_PORT = "443"; private static final String DEFAULT_CONTEXT = "durastore"; - private static final String DURACLOUD_PORT = ":DuraCloudPort"; - private static final String DURACLOUD_HOST = ":DuraCloudHost"; - private static final String DURACLOUD_CONTEXT = ":DuraCloudContext"; + private static final String DURACLOUD_PORT = DuraCloudPort.toString(); + private static final String DURACLOUD_HOST = DuraCloudHost.toString(); + private static final String DURACLOUD_CONTEXT = DuraCloudContext.toString(); public DuraCloudSubmitToArchiveCommand(DataverseRequest aRequest, DatasetVersion version) { diff --git 
a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/GetDataverseMetadataLanguageCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/GetDataverseMetadataLanguageCommand.java new file mode 100644 index 00000000000..82b5527048b --- /dev/null +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/GetDataverseMetadataLanguageCommand.java @@ -0,0 +1,41 @@ +package edu.harvard.iq.dataverse.engine.command.impl; + +import java.util.Collections; +import java.util.Map; +import java.util.Set; + +import edu.harvard.iq.dataverse.Dataverse; +import edu.harvard.iq.dataverse.DvObjectContainer; +import edu.harvard.iq.dataverse.authorization.Permission; +import edu.harvard.iq.dataverse.engine.command.AbstractCommand; +import edu.harvard.iq.dataverse.engine.command.CommandContext; +import edu.harvard.iq.dataverse.engine.command.DataverseRequest; +import edu.harvard.iq.dataverse.engine.command.exception.CommandException; + +public class GetDataverseMetadataLanguageCommand extends AbstractCommand<Map<String, String>> { + + private final Dataverse dv; + + public GetDataverseMetadataLanguageCommand(DataverseRequest aRequest, Dataverse dv) { + super(aRequest, dv); + this.dv = dv; + } + + @Override + public Map<String, String> execute(CommandContext ctxt) throws CommandException { + Map<String, String> langMap = ctxt.settings().getBaseMetadataLanguageMap(null, true); + String dvMetadataLanguage = dv.getMetadataLanguage(); + if (!dvMetadataLanguage.equals(DvObjectContainer.UNDEFINED_CODE)) { + return Collections.singletonMap(dvMetadataLanguage, langMap.get(dvMetadataLanguage)); + } + return langMap; + + } + + @Override + public Map<String, Set<Permission>> getRequiredPermissions() { + return Collections.singletonMap("", + dv.isReleased() ?
Collections.emptySet() + : Collections.singleton(Permission.ViewUnpublishedDataverse)); + } +} diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/GoogleCloudSubmitToArchiveCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/GoogleCloudSubmitToArchiveCommand.java index 7d749262b87..7dfb9f07e19 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/GoogleCloudSubmitToArchiveCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/GoogleCloudSubmitToArchiveCommand.java @@ -14,6 +14,8 @@ import edu.harvard.iq.dataverse.engine.command.DataverseRequest; import edu.harvard.iq.dataverse.engine.command.RequiredPermissions; import edu.harvard.iq.dataverse.settings.JvmSettings; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.GoogleCloudBucket; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.GoogleCloudProject; import edu.harvard.iq.dataverse.workflow.step.Failure; import edu.harvard.iq.dataverse.workflow.step.WorkflowStepResult; import org.apache.commons.codec.binary.Hex; @@ -35,8 +37,8 @@ public class GoogleCloudSubmitToArchiveCommand extends AbstractSubmitToArchiveCommand { private static final Logger logger = Logger.getLogger(GoogleCloudSubmitToArchiveCommand.class.getName()); - private static final String GOOGLECLOUD_BUCKET = ":GoogleCloudBucket"; - private static final String GOOGLECLOUD_PROJECT = ":GoogleCloudProject"; + private static final String GOOGLECLOUD_BUCKET = GoogleCloudBucket.toString(); + private static final String GOOGLECLOUD_PROJECT = GoogleCloudProject.toString(); public GoogleCloudSubmitToArchiveCommand(DataverseRequest aRequest, DatasetVersion version) { super(aRequest, version); diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/LocalSubmitToArchiveCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/LocalSubmitToArchiveCommand.java index d2f061b6e70..462879f2ec9 100644 --- 
a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/LocalSubmitToArchiveCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/LocalSubmitToArchiveCommand.java @@ -8,6 +8,7 @@ import edu.harvard.iq.dataverse.engine.command.Command; import edu.harvard.iq.dataverse.engine.command.DataverseRequest; import edu.harvard.iq.dataverse.engine.command.RequiredPermissions; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.BagItLocalPath; import edu.harvard.iq.dataverse.util.bagit.BagGenerator; import edu.harvard.iq.dataverse.util.bagit.OREMap; import edu.harvard.iq.dataverse.workflow.step.Failure; @@ -38,7 +39,7 @@ public LocalSubmitToArchiveCommand(DataverseRequest aRequest, DatasetVersion ver public WorkflowStepResult performArchiveSubmission(DatasetVersion dv, ApiToken token, Map requestedSettings) { logger.fine("In LocalCloudSubmitToArchive..."); - String localPath = requestedSettings.get(":BagItLocalPath"); + String localPath = requestedSettings.get(BagItLocalPath.toString()); String zipName = null; //Set a failure status that will be updated if we succeed diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/S3SubmitToArchiveCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/S3SubmitToArchiveCommand.java index 4f93e88de5e..65531d775c8 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/S3SubmitToArchiveCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/S3SubmitToArchiveCommand.java @@ -7,6 +7,7 @@ import edu.harvard.iq.dataverse.authorization.users.ApiToken; import edu.harvard.iq.dataverse.engine.command.DataverseRequest; import edu.harvard.iq.dataverse.engine.command.RequiredPermissions; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.S3ArchiverConfig; import edu.harvard.iq.dataverse.util.bagit.BagGenerator; import edu.harvard.iq.dataverse.util.bagit.OREMap; import 
edu.harvard.iq.dataverse.util.json.JsonUtil; @@ -64,7 +65,7 @@ public class S3SubmitToArchiveCommand extends AbstractSubmitToArchiveCommand { private ManagedExecutorService executorService; private static final Logger logger = Logger.getLogger(S3SubmitToArchiveCommand.class.getName()); - private static final String S3_CONFIG = ":S3ArchiverConfig"; + private static final String S3_CONFIG = S3ArchiverConfig.toString(); private static final Config config = ConfigProvider.getConfig(); protected S3AsyncClient s3 = null; diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/SetDataverseMetadataLanguageCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/SetDataverseMetadataLanguageCommand.java new file mode 100644 index 00000000000..438c332436a --- /dev/null +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/SetDataverseMetadataLanguageCommand.java @@ -0,0 +1,33 @@ +package edu.harvard.iq.dataverse.engine.command.impl; + +import java.util.Collections; +import java.util.Map; + +import edu.harvard.iq.dataverse.Dataverse; +import edu.harvard.iq.dataverse.DvObject; +import edu.harvard.iq.dataverse.authorization.Permission; +import edu.harvard.iq.dataverse.engine.command.AbstractCommand; +import edu.harvard.iq.dataverse.engine.command.CommandContext; +import edu.harvard.iq.dataverse.engine.command.DataverseRequest; +import edu.harvard.iq.dataverse.engine.command.RequiredPermissions; +import edu.harvard.iq.dataverse.engine.command.exception.CommandException; + +@RequiredPermissions(Permission.EditDataverse) +public class SetDataverseMetadataLanguageCommand extends AbstractCommand<Map<String, String>> { + + private Dataverse dv; + private String lang; + + public SetDataverseMetadataLanguageCommand(DataverseRequest aRequest, Dataverse dv, String lang) { + super(aRequest, dv); + this.dv = dv; + this.lang = lang; + } + + @Override + public Map<String, String> execute(CommandContext ctxt) throws CommandException { + dv.setMetadataLanguage(lang); + return
Collections.singletonMap(lang, ctxt.settings().getBaseMetadataLanguageMap(null, true).get(lang)); + } + +} diff --git a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/UpdateDatasetLicenseCommand.java b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/UpdateDatasetLicenseCommand.java index 37f97bf0db1..0d85dbb6f37 100644 --- a/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/UpdateDatasetLicenseCommand.java +++ b/src/main/java/edu/harvard/iq/dataverse/engine/command/impl/UpdateDatasetLicenseCommand.java @@ -13,28 +13,26 @@ import java.util.List; @RequiredPermissions(Permission.EditDataset) -public class UpdateDatasetLicenseCommand extends AbstractVoidCommand { - private final Dataset dataset; +public class UpdateDatasetLicenseCommand extends AbstractDatasetCommand { private License license = null; private TermsOfUseAndAccess customTermsOfUseAndAccess = null; public UpdateDatasetLicenseCommand(DataverseRequest aRequest, Dataset dataset, License license) { super(aRequest, dataset); - this.dataset = dataset; this.license = license; } public UpdateDatasetLicenseCommand(DataverseRequest aRequest, Dataset dataset, TermsOfUseAndAccess customTermsOfUseAndAccess) { super(aRequest, dataset); - this.dataset = dataset; this.customTermsOfUseAndAccess = customTermsOfUseAndAccess; } @Override - protected void executeImpl(CommandContext ctxt) throws CommandException { - DatasetVersion datasetVersion = dataset.getOrCreateEditVersion(); + public Dataset execute(CommandContext ctxt) throws CommandException { + DatasetVersion datasetVersion = getDataset().getOrCreateEditVersion(); datasetVersion.setVersionState(DatasetVersion.VersionState.DRAFT); + Dataset savedDataset = null; if (license != null) { if (!license.isActive()) { @@ -43,7 +41,7 @@ protected void executeImpl(CommandContext ctxt) throws CommandException { TermsOfUseAndAccess termsOfUseAndAccess = datasetVersion.getTermsOfUseAndAccess(); termsOfUseAndAccess.setLicense(license); - 
ctxt.engine().submit(new UpdateDatasetVersionCommand(this.dataset, getRequest())); + savedDataset = ctxt.engine().submit(new UpdateDatasetVersionCommand(getDataset(), getRequest())); } else if (customTermsOfUseAndAccess != null) { if (customTermsOfUseAndAccess.getTermsOfUse() == null || customTermsOfUseAndAccess.getTermsOfUse().isBlank()) { throw new InvalidCommandArgumentsException(BundleUtil.getStringFromBundle("updateDatasetLicenseCommand.errors.customTermsOfUseNotProvided"), this); @@ -52,8 +50,9 @@ protected void executeImpl(CommandContext ctxt) throws CommandException { applyCustomTerms(termsToUpdate, customTermsOfUseAndAccess); termsToUpdate.setLicense(null); datasetVersion.setTermsOfUseAndAccess(termsToUpdate); - ctxt.engine().submit(new UpdateDatasetVersionCommand(this.dataset, getRequest())); + savedDataset = ctxt.engine().submit(new UpdateDatasetVersionCommand(getDataset(), getRequest())); } + return savedDataset; } /** diff --git a/src/main/java/edu/harvard/iq/dataverse/filter/CorsFilter.java b/src/main/java/edu/harvard/iq/dataverse/filter/CorsFilter.java index 7d99d9ee4d2..d7f14fff245 100644 --- a/src/main/java/edu/harvard/iq/dataverse/filter/CorsFilter.java +++ b/src/main/java/edu/harvard/iq/dataverse/filter/CorsFilter.java @@ -1,20 +1,30 @@ package edu.harvard.iq.dataverse.filter; -import jakarta.inject.Inject; -import jakarta.servlet.*; -import jakarta.servlet.annotation.WebFilter; -import jakarta.servlet.http.HttpServletResponse; import java.io.IOException; +import java.util.Collections; +import java.util.HashSet; +import java.util.List; +import java.util.Set; +import java.util.stream.Collectors; import edu.harvard.iq.dataverse.settings.JvmSettings; -import edu.harvard.iq.dataverse.settings.SettingsServiceBean; +import edu.harvard.iq.dataverse.util.ListSplitUtil; +import jakarta.servlet.Filter; +import jakarta.servlet.FilterChain; +import jakarta.servlet.FilterConfig; +import jakarta.servlet.ServletException; +import jakarta.servlet.ServletRequest; 
+import jakarta.servlet.ServletResponse; +import jakarta.servlet.annotation.WebFilter; +import jakarta.servlet.http.HttpServletRequest; +import jakarta.servlet.http.HttpServletResponse; /** * CorsFilter is a servlet filter that handles Cross-Origin Resource Sharing (CORS) for the Dataverse application. * It configures and applies CORS headers to HTTP responses based on application settings. * * This filter: - * 1. Reads CORS configuration from JVM settings or (deprecated) the SettingsServiceBean. See the Dataverse Configuration Guide for more details. + * 1. Reads CORS configuration from JVM settings (dataverse.cors.*). See the Dataverse Configuration Guide for more details. * 2. Determines whether CORS should be allowed based on these settings. * 3. If CORS is allowed, it adds the appropriate CORS headers to all HTTP responses. The JVMSettings allow customization of the header contents if desired. * @@ -24,32 +34,33 @@ @WebFilter("/*") public class CorsFilter implements Filter { - @Inject - private SettingsServiceBean settingsSvc; private boolean allowCors; - private String origin; + private boolean allowAllOrigins; + private Set<String> allowedOrigins = Collections.emptySet(); private String methods; private String allowHeaders; private String exposeHeaders; @Override public void init(FilterConfig filterConfig) throws ServletException { - origin = JvmSettings.CORS_ORIGIN.lookupOptional().orElse(null); - boolean corsSetting = settingsSvc.isTrueForKey(SettingsServiceBean.Key.AllowCors, true); - - if (origin == null && !corsSetting) { - allowCors = false; - } else { - allowCors = true; - origin = (origin != null) ?
origin : "*"; - } + List<String> origins = JvmSettings.CORS_ORIGIN.lookupSplittedListOptional().orElse(List.of()); + allowCors = !origins.isEmpty(); if (allowCors) { - methods = JvmSettings.CORS_METHODS.lookupOptional().orElse("PUT, GET, POST, DELETE, OPTIONS"); - allowHeaders = JvmSettings.CORS_ALLOW_HEADERS.lookupOptional() + if (origins.contains("*")) { + allowAllOrigins = true; + } else { + allowedOrigins = Set.copyOf(origins); + } + + methods = JvmSettings.CORS_METHODS.lookupSplittedListOptional() + .map(values -> String.join(", ", values)) + .orElse("GET, POST, OPTIONS, PUT, DELETE"); + allowHeaders = JvmSettings.CORS_ALLOW_HEADERS.lookupSplittedListOptional() + .map(values -> String.join(", ", values)) .orElse("Accept, Content-Type, X-Dataverse-key, Range"); - exposeHeaders = JvmSettings.CORS_EXPOSE_HEADERS.lookupOptional() + exposeHeaders = JvmSettings.CORS_EXPOSE_HEADERS.lookupSplittedListOptional() + .map(values -> String.join(", ", values)) .orElse("Accept-Ranges, Content-Range, Content-Encoding"); } } @@ -58,12 +69,35 @@ public void init(FilterConfig filterConfig) throws ServletException { public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse, FilterChain chain) throws IOException, ServletException { if (allowCors) { + HttpServletRequest request = (HttpServletRequest) servletRequest; HttpServletResponse response = (HttpServletResponse) servletResponse; - response.addHeader("Access-Control-Allow-Origin", origin); - response.addHeader("Access-Control-Allow-Methods", methods); - response.addHeader("Access-Control-Allow-Headers", allowHeaders); - response.addHeader("Access-Control-Expose-Headers", exposeHeaders); + + String originHeader = request.getHeader("Origin"); + String requestOrigin = originHeader == null ?
null : originHeader.trim(); + + if (allowAllOrigins) { + response.setHeader("Access-Control-Allow-Origin", "*"); + } else if (requestOrigin != null && allowedOrigins.contains(requestOrigin)) { + response.setHeader("Access-Control-Allow-Origin", requestOrigin); + response.setHeader("Vary", appendVary(response.getHeader("Vary"), "Origin")); + } + + response.setHeader("Access-Control-Allow-Methods", methods); + response.setHeader("Access-Control-Allow-Headers", allowHeaders); + response.setHeader("Access-Control-Expose-Headers", exposeHeaders); } chain.doFilter(servletRequest, servletResponse); } + + private String appendVary(String existing, String value) { + if (existing == null || existing.isEmpty()) { + return value; + } + Set<String> tokens = ListSplitUtil.split(existing).stream() + .map(String::trim) + .filter(token -> !token.isEmpty()) + .collect(Collectors.toCollection(HashSet::new)); + tokens.add(value); + return String.join(", ", tokens); + } } diff --git a/src/main/java/edu/harvard/iq/dataverse/flyway/SettingsCleanupCallback.java b/src/main/java/edu/harvard/iq/dataverse/flyway/SettingsCleanupCallback.java new file mode 100644 index 00000000000..4b02f07a810 --- /dev/null +++ b/src/main/java/edu/harvard/iq/dataverse/flyway/SettingsCleanupCallback.java @@ -0,0 +1,103 @@ +package edu.harvard.iq.dataverse.flyway; + +import edu.harvard.iq.dataverse.settings.SettingsServiceBean; +import org.flywaydb.core.api.FlywayException; +import org.flywaydb.core.api.callback.Callback; +import org.flywaydb.core.api.callback.Context; +import org.flywaydb.core.api.callback.Event; + +import java.sql.Connection; +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.util.ArrayList; +import java.util.List; +import java.util.logging.Level; +import java.util.logging.Logger; + +/** + * Flyway callback that runs after all migrations and removes any settings + * whose "name" column does not correspond to a SettingsServiceBean.Key.
+ * + * This enforces that the settings table contains only keys known to the + * current application version. + */ +public class SettingsCleanupCallback implements Callback { + + private static final Logger logger = Logger.getLogger(SettingsCleanupCallback.class.getName()); + + @Override + public boolean supports(Event event, Context context) { + // Only run after all migrations have completed successfully. + return event == Event.AFTER_MIGRATE; + } + + @Override + public boolean canHandleInTransaction(Event event, Context context) { + // Prefer to run inside the same transaction + return true; + } + + @Override + public void handle(Event event, Context context) { + if (event != Event.AFTER_MIGRATE) { + return; + } + + logger.info("Starting settings cleanup: removing entries with unknown keys"); + + try { + cleanupInvalidSettings(context.getConnection()); + } catch (SQLException e) { + logger.log(Level.SEVERE, "Error while cleaning up settings table", e); + throw new FlywayException("Failed to clean up invalid settings", e); + } + + logger.info("Finished cleaning up settings"); + } + + @Override + public String getCallbackName() { + return "SettingsCleanup"; + } + + private void cleanupInvalidSettings(Connection connection) throws SQLException { + // Collect IDs of rows to delete + List<Long> idsToDelete = new ArrayList<>(); + + String selectSql = "SELECT id, name FROM setting"; + try (PreparedStatement ps = connection.prepareStatement(selectSql); + ResultSet rs = ps.executeQuery()) { + + while (rs.next()) { + long id = rs.getLong("id"); + String name = rs.getString("name"); + + // We expect names like ":KeyName". Anything that does not parse + // to a SettingsServiceBean.Key is considered invalid and will be removed.
+ SettingsServiceBean.Key key = SettingsServiceBean.Key.parse(name); + if (key == null) { + idsToDelete.add(id); + } + } + } + + if (idsToDelete.isEmpty()) { + logger.fine("Settings cleanup: no invalid settings found"); + return; + } + + logger.info(() -> "Settings cleanup: found " + idsToDelete.size() + + " invalid settings; deleting them"); + + String deleteSql = "DELETE FROM setting WHERE id = ?"; + try (PreparedStatement delete = connection.prepareStatement(deleteSql)) { + for (Long id : idsToDelete) { + delete.setLong(1, id); + delete.addBatch(); + } + int[] counts = delete.executeBatch(); + logger.info(() -> "Settings cleanup: deleted " + counts.length + " rows with invalid keys"); + } + } +} diff --git a/src/main/java/edu/harvard/iq/dataverse/flyway/StartupFlywayMigrator.java b/src/main/java/edu/harvard/iq/dataverse/flyway/StartupFlywayMigrator.java index 39bc46216ca..06c6048c65a 100644 --- a/src/main/java/edu/harvard/iq/dataverse/flyway/StartupFlywayMigrator.java +++ b/src/main/java/edu/harvard/iq/dataverse/flyway/StartupFlywayMigrator.java @@ -27,6 +27,14 @@ void migrateDatabase() { Flyway flyway = Flyway.configure() .dataSource(dataSource) + .locations( + // Path where to find normal SQL migrations + "classpath:db/migration", + // Path where to find compiled Java migrations + "classpath:edu/harvard/iq/dataverse/flyway" + ) + // Java-based callbacks are not auto-discovered (unlike migrations) + .callbacks(new SettingsCleanupCallback()) .baselineOnMigrate(true) .load(); diff --git a/src/main/java/edu/harvard/iq/dataverse/pidproviders/AbstractPidProvider.java b/src/main/java/edu/harvard/iq/dataverse/pidproviders/AbstractPidProvider.java index 0b5b49fc52d..0affd32eb99 100644 --- a/src/main/java/edu/harvard/iq/dataverse/pidproviders/AbstractPidProvider.java +++ b/src/main/java/edu/harvard/iq/dataverse/pidproviders/AbstractPidProvider.java @@ -7,6 +7,7 @@ import edu.harvard.iq.dataverse.DatasetVersion; import edu.harvard.iq.dataverse.DvObject; import 
edu.harvard.iq.dataverse.GlobalId; +import edu.harvard.iq.dataverse.util.ListSplitUtil; import edu.harvard.iq.dataverse.util.SystemConfig; import jakarta.json.Json; import jakarta.json.JsonObject; @@ -60,10 +61,10 @@ protected AbstractPidProvider(String id, String label, String protocol, String a this.identifierGenerationStyle = identifierGenerationStyle; this.datafilePidFormat = datafilePidFormat; if(!managedList.isEmpty()) { - this.managedSet.addAll(Arrays.asList(managedList.split(",\\s"))); + this.managedSet.addAll(ListSplitUtil.split(managedList)); } if(!excludedList.isEmpty()) { - this.excludedSet.addAll(Arrays.asList(excludedList.split(",\\s"))); + this.excludedSet.addAll(ListSplitUtil.split(excludedList)); } if (logger.isLoggable(Level.FINE)) { Iterator<String> iter = managedSet.iterator(); diff --git a/src/main/java/edu/harvard/iq/dataverse/pidproviders/PidProviderFactoryBean.java b/src/main/java/edu/harvard/iq/dataverse/pidproviders/PidProviderFactoryBean.java index 1bd49bc7f6e..267cbab3edd 100644 --- a/src/main/java/edu/harvard/iq/dataverse/pidproviders/PidProviderFactoryBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/pidproviders/PidProviderFactoryBean.java @@ -12,7 +12,6 @@ import java.util.HashMap; import java.util.List; import java.util.Map; -import java.util.NoSuchElementException; import java.util.Optional; import java.util.ServiceLoader; import java.util.logging.Level; @@ -23,11 +22,9 @@ import jakarta.ejb.Singleton; import jakarta.ejb.Startup; import jakarta.inject.Inject; -import jakarta.json.JsonObject; import edu.harvard.iq.dataverse.settings.JvmSettings; import edu.harvard.iq.dataverse.settings.SettingsServiceBean; import edu.harvard.iq.dataverse.util.SystemConfig; -import edu.harvard.iq.dataverse.DatasetFieldServiceBean; import edu.harvard.iq.dataverse.DataverseServiceBean; import edu.harvard.iq.dataverse.DvObjectServiceBean; import edu.harvard.iq.dataverse.GlobalId; @@ -121,14 +118,12 @@ private void loadProviderFactories() { } private void
loadProviders() { - Optional<String[]> providers = JvmSettings.PID_PROVIDERS.lookupOptional(String[].class); - if (!providers.isPresent()) { + Optional<List<String>> providersOpt = JvmSettings.PID_PROVIDERS.lookupSplittedListOptional(); + if (!providersOpt.isPresent() || providersOpt.get().isEmpty()) { logger.warning( "No PidProviders configured via dataverse.pid.providers. Please consider updating as older PIDProvider configuration mechanisms will be removed in a future version of Dataverse."); } else { - for (String id : providers.get()) { - //Allows spaces in PID_PROVIDERS setting - id=id.trim(); + for (String id : providersOpt.get()) { Optional<String> type = JvmSettings.PID_PROVIDER_TYPE.lookupOptional(id); if (!type.isPresent()) { logger.warning("PidProvider " + id diff --git a/src/main/java/edu/harvard/iq/dataverse/settings/JvmSettings.java b/src/main/java/edu/harvard/iq/dataverse/settings/JvmSettings.java index 87123801a3e..07dc417ba1f 100644 --- a/src/main/java/edu/harvard/iq/dataverse/settings/JvmSettings.java +++ b/src/main/java/edu/harvard/iq/dataverse/settings/JvmSettings.java @@ -309,6 +309,7 @@ public enum JvmSettings { private final String key; private final String scopedKey; + @SuppressWarnings("unused") private final JvmSettings parent; private final List<String> oldNames; private final int placeholders; @@ -608,4 +609,73 @@ public String insert(String... arguments) { return String.format(this.getScopedKey(), (Object[]) arguments); } + /** + * Lookup an optional comma-separated value and return the tokens as an immutable list. + * MicroProfile Config removes zero-length segments when it converts to {@code String[]}, but + * it leaves any leading or trailing whitespace on the surviving tokens (including tokens that + * contain only spaces). This convenience overload trims each token; after trimming, any token + * that becomes empty (because it consisted solely of whitespace) is discarded so callers still + * receive a list that is free of empty strings.
Use the boolean overload with {@code false} if + * you need the exact whitespace that MicroProfile provided. + * + * @return an {@link Optional} containing the list of tokens when the setting is present; + * an empty {@link Optional} if the setting is not configured + */ + public Optional<List<String>> lookupSplittedListOptional() { + return lookupSplittedListOptional(true); + } + + /** + * Lookup an optional comma-separated value and return the tokens as an immutable list. + * + * @param trimSpaces when {@code true}, individual elements are trimmed; tokens that become empty after + * trimming (because they were all whitespace) are removed to preserve MicroProfile's + * "no empty entries" guarantee; when {@code false}, the tokens are returned exactly as + * produced by MicroProfile Config + * @return an {@link Optional} containing the list of tokens when the setting is present; + * an empty {@link Optional} if the setting is not configured + */ + public Optional<List<String>> lookupSplittedListOptional(boolean trimSpaces) { + return lookupOptional(String[].class) + .map(values -> Arrays.stream(values) + .map(s -> trimSpaces ? s.trim() : s) + .filter(s -> trimSpaces ? !s.isEmpty() : true) + .toList()); + } + + /** + * Lookup a required comma-separated value and return the tokens as an immutable list. + * MicroProfile Config removes zero-length segments when it converts to {@code String[]}, but it + * leaves any leading or trailing whitespace on the surviving tokens (including tokens that contain + * only spaces). This convenience overload trims each token; after trimming, any token that becomes + * empty (because it consisted solely of whitespace) is discarded so callers still receive a list that + * is free of empty strings. Use the boolean overload with {@code false} if you need the exact whitespace + * that MicroProfile provided.
+ * + * @return the list of tokens for the configured setting + * @throws java.util.NoSuchElementException if the setting is missing or blank + * @throws IllegalArgumentException if conversion to {@code String[]} fails + */ + public List<String> lookupSplittedList() { + return lookupSplittedList(true); + } + + /** + * Lookup a required comma-separated value and return the tokens as an immutable list. + * + * @param trimSpaces when {@code true}, individual elements are trimmed; tokens that become empty after + * trimming (because they were all whitespace) are removed to preserve MicroProfile's + * "no empty entries" guarantee; when {@code false}, the tokens are returned exactly as + * produced by MicroProfile Config + * @return the list of tokens for the configured setting + * @throws java.util.NoSuchElementException if the setting is missing or blank + * @throws IllegalArgumentException if conversion to {@code String[]} fails + */ + public List<String> lookupSplittedList(boolean trimSpaces) { + return Arrays.stream(lookup(String[].class)) + .map(s -> trimSpaces ? s.trim() : s) + .filter(s -> trimSpaces ? !s.isEmpty() : true) + .toList(); + } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/settings/Setting.java b/src/main/java/edu/harvard/iq/dataverse/settings/Setting.java index b1910a2fbb5..e187d3db1cc 100644 --- a/src/main/java/edu/harvard/iq/dataverse/settings/Setting.java +++ b/src/main/java/edu/harvard/iq/dataverse/settings/Setting.java @@ -9,6 +9,8 @@ import jakarta.persistence.NamedQuery; import jakarta.persistence.GeneratedValue; import jakarta.persistence.GenerationType; +import jakarta.persistence.Table; +import jakarta.persistence.UniqueConstraint; /** * A single value in the config of dataverse.
@@ -16,45 +18,69 @@ */ @NamedQueries({ @NamedQuery( name="Setting.deleteByName", - query="DELETE FROM Setting s WHERE s.name=:name AND s.lang IS NULL"), + query="DELETE FROM Setting s WHERE s.name=:name AND s.lang=''"), @NamedQuery( name="Setting.findAll", query="SELECT s FROM Setting s"), + @NamedQuery( name="Setting.findAllWithoutLang", + query="SELECT s FROM Setting s WHERE s.lang=''"), @NamedQuery( name="Setting.findByName", - query = "SELECT s FROM Setting s WHERE s.name=:name AND s.lang IS NULL" ), + query="SELECT s FROM Setting s WHERE s.name=:name AND s.lang=''"), @NamedQuery( name="Setting.deleteByNameAndLang", - query="DELETE FROM Setting s WHERE s.name=:name AND s.lang=:lang"), + query="DELETE FROM Setting s WHERE s.name=:name AND s.lang=:lang"), @NamedQuery( name="Setting.findByNameAndLang", - query = "SELECT s FROM Setting s WHERE s.name=:name AND s.lang=:lang" ) - + query="SELECT s FROM Setting s WHERE s.name=:name AND s.lang=:lang") }) @Entity +@Table(uniqueConstraints = { + @UniqueConstraint(name = "uc_setting_name_lang", columnNames = {"name", "lang"}), +}) public class Setting implements Serializable { @Id @GeneratedValue(strategy = GenerationType.IDENTITY) private Long id; - @Column(columnDefinition = "TEXT") + @Column(length = 200, nullable = false) private String name; - - @Column(columnDefinition = "TEXT") - private String lang; + + /** + * The default value is an empty string, which indicates no specific language is set. + * Using a NULL value here instead would let duplicate settings slip past the UNIQUE constraint, + * because allowing multiple NULLs within a UNIQUE constraint is part of the SQL standard, which Postgres follows.
+ * As it stores ISO codes, 10 chars is good enough (ISO codes are 2-8 chars by spec) + */ + @Column(length = 10, nullable = false) + private String lang = ""; @Column(columnDefinition = "TEXT") private String content; - public Setting() { + protected Setting() { + // Intentionally left blank - no empty settings allowed. + // Protected visibility to allow JPA to work. } public Setting(String name, String content) { - this.name = name; - this.content = content; + Objects.requireNonNull(name, "Setting name cannot be null"); + this.name = name; + this.content = content; } - + + /** + * Constructs a new Setting object with the specified name, language, and content. + * + * @param name the name of the setting; must not be null + * @param lang the language of the setting, represented as an ISO code; must not be null; + * may be empty to represent a non-localized setting. + * @param content the content or value associated with this setting + * @throws NullPointerException if the name or lang parameters are null + */ public Setting(String name, String lang, String content) { + Objects.requireNonNull(name, "Setting name cannot be null"); + Objects.requireNonNull(lang, "Setting lang cannot be null"); this.name = name; - this.content = content; this.lang = lang; + this.content = content; } public String getName() { @@ -62,6 +88,7 @@ public String getName() { } public void setName(String name) { + Objects.requireNonNull(name, "Setting name cannot be null"); this.name = name; } @@ -72,37 +99,60 @@ public String getContent() { public void setContent(String content) { this.content = content; } - + + /** + * Retrieves the language associated with this Setting instance. + * The language is represented as an ISO code string. + * An empty string indicates that no specific localization is set. + * + * @return the language code of this Setting; never null + */ public String getLang() { return lang; } - + + /** + * Sets the language for this Setting instance. 
+ * The language is represented as a non-null ISO code string. + * An empty string indicates that no specific localization shall be set. + * + * @param lang the language code to set; must not be null + * @throws NullPointerException if the provided lang parameter is null + */ public void setLang(String lang) { + Objects.requireNonNull(lang, "Setting lang cannot be null"); this.lang = lang; } @Override public int hashCode() { - int hash = 7; - hash = 73 * hash + Objects.hashCode(this.name); - return hash; + return Objects.hash(name, lang); } - + + /** + * Compares this Setting instance to another object for equality. Two Setting + * objects are considered equal if their {@code name} and {@code lang} fields are + * both equal. + * @implNote We do not use the {@code id} and {@code content} fields for the comparison. + * This is due to how these objects usually are used: + * - Comparing by the mutable content could break hash-based collections. + * - Configuration management requires stable identity based on setting's name and localization. + * The content of the settings is irrelevant for lookups.
+ * + * @param obj the object to compare this Setting with + * @return {@code true} if the specified object is equal to this Setting, {@code false} otherwise + */ @Override public boolean equals(Object obj) { - if (obj == null) { - return false; - } - if ( !(obj instanceof Setting) ) { - return false; + if (this == obj) { + return true; } - final Setting other = (Setting) obj; - if (!Objects.equals(this.name, other.name)) { + if (!(obj instanceof Setting other)) { return false; } - return Objects.equals(this.content, other.content); + return Objects.equals(this.name, other.name) && Objects.equals(this.lang, other.lang); } - + @Override public String toString() { return "[Setting name:" + getName() + " value:" + getContent() + "]"; diff --git a/src/main/java/edu/harvard/iq/dataverse/settings/SettingsServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/settings/SettingsServiceBean.java index b323a9b7861..1cdac02a013 100644 --- a/src/main/java/edu/harvard/iq/dataverse/settings/SettingsServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/settings/SettingsServiceBean.java @@ -7,9 +7,16 @@ import edu.harvard.iq.dataverse.util.json.JsonUtil; import jakarta.ejb.EJB; import jakarta.ejb.Stateless; +import jakarta.ejb.TransactionAttribute; +import jakarta.ejb.TransactionAttributeType; import jakarta.inject.Named; +import jakarta.json.Json; import jakarta.json.JsonArray; +import jakarta.json.JsonArrayBuilder; +import jakarta.json.JsonException; import jakarta.json.JsonObject; +import jakarta.json.JsonObjectBuilder; +import jakarta.json.JsonString; import jakarta.json.JsonValue; import jakarta.persistence.EntityManager; import jakarta.persistence.PersistenceContext; @@ -18,16 +25,22 @@ import org.json.JSONException; import org.json.JSONObject; +import java.util.ArrayList; +import java.util.Arrays; import java.util.Collections; import java.util.HashMap; import java.util.HashSet; import java.util.LinkedHashMap; import java.util.List; +import java.util.Locale; import 
java.util.Map; +import java.util.Objects; import java.util.Set; import java.util.StringTokenizer; +import java.util.function.Function; import java.util.logging.Level; import java.util.logging.Logger; +import java.util.stream.Collectors; /** * Service bean accessing a persistent hash map, used as settings in the application. @@ -159,6 +172,20 @@ public enum Key { @Deprecated(forRemoval = true, since = "2025-04-29") BlockedApiPolicy, + /** + * Semicolon-separated allowlist of IP addresses allowed administrative access to workflows. + */ + WorkflowsAdminIpWhitelist, + + /** + * A special secret that, if set, needs to be given when trying to manage internal users. + * This key was formerly known as "BuiltinUsers.KEY", a name that never aligned with the general naming pattern for settings. + * At some future point this setting should be moved to JvmSettings (so we consume proper secrets) + * or removed outright with the transition to the SPA frontend requiring an external IdP. + */ + @Deprecated(forRemoval = true, since = "2025-08-01") + BuiltinUsersKey, + /** * For development only (see dev guide for details). Backed by an enum * of possible account types. @@ -445,7 +472,46 @@ Whether Harvesting (OAI) service is enabled */ ArchiverClassName, + /** + * Custom settings for each archiver. See list below. + */ ArchiverSettings, + /** + * :ArchiverSettings used by DRSSubmitToArchiveCommand. DRS is a system + * specific to Harvard which is why we don't document it in the guides. + * See also https://github.com/IQSS/dataverse.harvard.edu/issues/177 + */ + DRSArchiverConfig, + /** + * :ArchiverSettings used by DuraCloudSubmitToArchiveCommand. + */ + DuraCloudPort, + DuraCloudHost, + DuraCloudContext, + /** + * :ArchiverSettings used by GoogleCloudSubmitToArchiveCommand. + */ + GoogleCloudBucket, + GoogleCloudProject, + /** + * :ArchiverSettings used by LocalSubmitToArchiveCommand. + */ + BagItLocalPath, + /** + * :ArchiverSettings used by S3SubmitToArchiveCommand.
+ */ + S3ArchiverConfig, + /** + * :ArchiverSettings used by multiple archive commands. + */ + BagGeneratorThreads, + /** + * Various BagIt settings. + */ + BagValidatorJobPoolSize, + BagValidatorMaxErrors, + BagValidatorJobWaitInterval, + BagItHandlerEnabled, /** * A comma-separated list of roles for which new dataverses should inherit the * corresponding role assignments from the parent dataverse. Also affects @@ -622,6 +688,8 @@ Whether Harvesting (OAI) service is enabled * LDN Inbox Allowed Hosts - a comma separated list of IP addresses allowed to submit messages to the inbox */ LDNMessageHosts, + LDNAnnounceRequiredFields, + LDNTarget, /* * Allow a custom JavaScript to control values of specific fields. @@ -678,6 +746,8 @@ Whether Harvesting (OAI) service is enabled * files *with* the variable names line up top. */ StoreIngestedTabularFilesWithVarHeaders, + FileCategories, + CreateDataFilesMaxErrorsToDisplay, ContactFeedbackMessageSizeLimit, //Experimental setting to allow connecting to a GET external search service expecting a GET request with query parameter mirroring the search API query parameters (without search_service) @@ -694,11 +764,52 @@ Whether Harvesting (OAI) service is enabled public String toString() { return ":" + name(); } + + /** + * Parses the input string to match a corresponding {@code SettingsServiceBean.Key}. + * The method expects the input string to start with a colon (:) followed by the key name. + * If the key name matches one of the existing {@code SettingsServiceBean.Key} enumerations, + * the corresponding key is returned. The check is case-sensitive. + * + * @param key the input string in the format ":KeyName", where "KeyName" corresponds + * to the name of an enumeration in {@code SettingsServiceBean.Key}. + * If {@code key} is null, blank, does not start with a colon (:), or does not + * match any known key, the method returns {@code null}. 
+ * @return the corresponding {@code SettingsServiceBean.Key} if the key matches one + * of the predefined keys, or {@code null} if no match is found. + */ + public static SettingsServiceBean.Key parse(String key) { + // Null safety and format check + if (key == null || key.isBlank() || key.charAt(0) != ':') return null; + + // Cut off the ":" we verified is present before + String normalizedKey = key.substring(1); + + // Iterate through all the known keys and return on match (case sensitive!) + // We are case sensitive here because Dataverse implicitly uses case-sensitive keys everywhere! + for (SettingsServiceBean.Key k : SettingsServiceBean.Key.values()) { + if (k.name().equals(normalizedKey)) { + return k; + } + } + + // Fall through on no match + return null; + } } @PersistenceContext EntityManager em; + /** + * A reference to the current instance of the SettingsServiceBean. + * Used when self-invocation is required for internal method calls + * within the same bean to ensure that all EJB functionalities + * such as transactions and security are properly applied. + */ + @EJB + private SettingsServiceBean self; + @EJB ActionLogServiceBean actionLogSvc; @@ -706,7 +817,11 @@ public String toString() { * Basic functionality - get the name, return the setting, or {@code null}. * @param name of the setting * @return the actual setting, or {@code null}. + * + * @deprecated This will be removed in a future version of Dataverse. Please refrain from using it and migrate + * any code doing so to use a {@link Key} and the {@link #getValueForKey(Key)} variants instead. */ + @Deprecated(since = "6.9", forRemoval = true) public String get( String name ) { List<Setting> tokens = em.createNamedQuery("Setting.findByName", Setting.class) .setParameter("name", name ) @@ -850,13 +965,25 @@ public Boolean getValueForCompoundKeyAsBoolean(Key key, String param) { * @param name Name of the setting. * @param defaultValue The value to return if no setting is found in the DB.
* @return Either the stored value, or the default value. + * + * @deprecated This will be removed in a future version of Dataverse. Please refrain from using it and migrate + * any code doing so to use a {@link Key} and the {@link #getValueForKey(Key)} variants instead. */ + @Deprecated(since = "6.9", forRemoval = true) public String get( String name, String defaultValue ) { String val = get(name); return (val!=null) ? val : defaultValue; } + /** + * @deprecated This will be removed in a future version of Dataverse. Please refrain from using it and migrate + * any code doing so to use a {@link Key} and the {@link #getValueForKey(Key)} variants instead. + */ + @Deprecated(since = "6.9", forRemoval = true) public String get(String name, String lang, String defaultValue ) { + // Database safeguard, as the default is an empty string + if (lang == null) lang = ""; + List tokens = em.createNamedQuery("Setting.findByNameAndLang", Setting.class) .setParameter("name", name ) .setParameter("lang", lang ) @@ -873,9 +1000,17 @@ public String getValueForKey( Key key, String defaultValue ) { } public String getValueForKey( Key key, String lang, String defaultValue ) { + // Database safeguard, as the default is an empty string + if (lang == null) lang = ""; + return get( key.toString(), lang, defaultValue ); } - + + /** + * @deprecated This will be removed in a future version of Dataverse. Please refrain from using it and migrate + * any code doing so to use a {@link Key} and the {@link #setValueForKey(Key, String)} variants instead. + */ + @Deprecated(since = "6.9", forRemoval = true) public Setting set( String name, String content ) { Setting s = null; @@ -898,8 +1033,16 @@ public Setting set( String name, String content ) { .setInfo(name + ": " + content)); return s; } - + + /** + * @deprecated This will be removed in a future version of Dataverse. 
Please refrain from using it and migrate + * any code doing so to use a {@link Key} and the {@link #setValueForKey(Key, String)} variants instead. + */ + @Deprecated(since = "6.9", forRemoval = true) public Setting set( String name, String lang, String content ) { + // Database safeguard, as the default is an empty string + if (lang == null) lang = ""; + Setting s = null; List tokens = em.createNamedQuery("Setting.findByNameAndLang", Setting.class) @@ -933,7 +1076,11 @@ public Setting setValueForKey( Key key, String content ) { * @param name name of the setting. * @param defaultValue logical value of {@code null}. * @return boolean value of the setting. + * + * @deprecated This will be removed in a future version of Dataverse. Please refrain from using it and migrate + * any code doing so to use a {@link Key} and {@link #isTrueForKey(Key, boolean)} instead. */ + @Deprecated(since = "6.9", forRemoval = true) public boolean isTrue( String name, boolean defaultValue ) { String val = get(name); return ( val==null ) ? defaultValue : StringUtil.isTrue(val); @@ -960,6 +1107,11 @@ public void deleteValueForKey( Key name ) { delete( name.toString() ); } + /** + * @deprecated This will be removed in a future version of Dataverse. Please refrain from using it and migrate + * any code doing so to use a {@link Key} and {@link #deleteValueForKey(Key)} instead. + */ + @Deprecated(since = "6.9", forRemoval = true) public void delete( String name ) { actionLogSvc.log( new ActionLogRecord(ActionLogRecord.ActionType.Setting, "delete") .setInfo(name)); @@ -967,8 +1119,16 @@ public void delete( String name ) { .setParameter("name", name) .executeUpdate(); } - + + /** + * @deprecated This will be removed in a future version of Dataverse. Please refrain from using it and migrate + * any code doing so to use a {@link Key} and {@link #deleteValueForKey(Key)} instead. 
+ */ + @Deprecated(since = "6.9", forRemoval = true) public void delete( String name, String lang ) { + // Database safeguard, as the default is an empty string + if (lang == null) lang = ""; + actionLogSvc.log( new ActionLogRecord(ActionLogRecord.ActionType.Setting, "delete") .setInfo(name)); em.createNamedQuery("Setting.deleteByNameAndLang") @@ -977,8 +1137,268 @@ public void delete( String name, String lang ) { .executeUpdate(); } - public Set listAll() { - return new HashSet<>(em.createNamedQuery("Setting.findAll", Setting.class).getResultList()); + /** + * Retrieves all settings that do not have any language localizations. + * This method uses a named query to fetch settings where the language field is null. + * + * @return a set of {@link Setting} objects that do not have language localizations. + */ + public Set listAllWithoutLocalizations() { + return new HashSet<>(em.createNamedQuery("Setting.findAllWithoutLang", Setting.class).getResultList()); + } + + public static final String L10N_KEY_SEPARATOR = "/lang/"; + + /** + * Retrieves all settings from the database and converts them into a JSON object. + * Each setting is represented as a key-value pair in the JSON object. The key + * is the setting name, optionally appended with the language if the setting is + * language-specific, while the value corresponds to the setting's content. + * + * @return A {@link JsonObject} containing all settings from the database, structured + * with their names (and languages, if applicable) as keys and their + * respective contents as values. 
+ * Shortened Example: + * + * { + * ":FilePIDsEnabled": "false", + * ":ApplicationTermsOfUse": "Non-localized default / fallback terms.", + * ":ApplicationTermsOfUse/lang/fr": "Il s'agit de termes localisés en français.", + * ":MaxFileUploadSizeInBytes": { + * "default": "2147483648", + * "fileOne": "4000000000", + * "s3": "8000000000" + * } + * } + * + * + * @implNote The reason to use a flattened approach for the localized settings is to stay backward compatible. + * Per good practice, a bulk operation should be a composite of the single operation. + * As you need to provide the language parameter to query or put them individually, the localization is not + * part of the content model, but of the {@link Setting} data model. Using a JSON sub-object or using + * a separated approach is possible, but adds additional complexity. In case of the sub-object it even + * breaks the guarantee that the value you retrieve from the bulk operation can be used for a single operation again. + * As long as we do not update our content model, but store the language as part of the data model, + * this flattening seems to be the most balanced compromise. + */ + public JsonObject listAllAsJson() { + Set settings = new HashSet<>(em.createNamedQuery("Setting.findAll", Setting.class).getResultList()); + JsonObjectBuilder response = Json.createObjectBuilder(); + + // Iterate over all the settings and add them to the response. + settings.forEach(setting -> { + String name = convertToJsonKey(setting); + + try { + // In case the setting is JSON, treat it as such in the output (so the API can return valid JSON) + response.add(name, JsonUtil.getJsonValue(setting.getContent())); + } catch (JsonException e) { + // This wasn't valid JSON, so we just add it as a string + response.add(name, setting.getContent()); + } + }); + + return response.build(); + } + + /** + * Updates all current settings from the specified JSON object. 
Validates the input JSON, + converts it to a set of settings and replaces all existing settings with the new ones + in an atomic operation. If the settings object is null, contains invalid keys, or if the new + set of settings is empty, the method throws an appropriate exception. + * + * @param settings the JSON object containing the new configuration settings to be applied; must not be null + * @return a JsonObjectBuilder representing the operational details of the applied updates + * @throws SettingsValidationException if the settings object is null, contains invalid keys or results in empty settings + */ + public JsonObjectBuilder setAllFromJson(JsonObject settings) { + if (settings == null) { + throw new SettingsValidationException("Settings cannot be null"); + } + + // Validate the input + List invalidKeys = validateKeys(settings); + if (!invalidKeys.isEmpty()) { + throw new SettingsValidationException("Invalid key(s): " + String.join(", ", invalidKeys)); + } + + // Convert JSON to Setting objects + Set newSettings = convertJsonToSettings(settings); + + // Perform atomic update (replace all settings) + // We don't allow completely wiping all settings via JSON here, so no accidents happen. + // (It's completely unrealistic someone would try to remove all settings and leave it at that.) + if (newSettings != null && !newSettings.isEmpty()) { + // Execute the update (in one atomic operation using a transaction) + // Note: We need to call via self-reference so the EJB container can create a transaction as intended. + Map operationalDetails = self.replaceAllSettings(newSettings); + + return Op.convertToJson(operationalDetails); + } + throw new SettingsValidationException("Settings cannot be empty - you'd wipe the entire configuration."); + } + + /** + * Converts a JSON object representing settings into a set of Setting objects. + * Each entry in the JSON object is processed to create a Setting instance. 
+ If the key includes a language (indicated by a separator), the language + information is extracted and included in the Setting object. + Note: This method expects a pre-validated JsonObject and will happily create + nonsense settings for you otherwise. This is a reason for the package visibility. + * + * @param settings a (pre-validated) {@link JsonObject} containing key-value pairs where + * each key represents a setting name (and optionally a language code), + * and each value represents the associated content. + * @return a {@link Set} of {@link Setting} objects parsed from the input JSON object. + */ + static Set convertJsonToSettings(JsonObject settings) { + Objects.requireNonNull(settings, "The settings object cannot be null."); + return settings.entrySet().stream() + .map(entry -> { + String key = entry.getKey(); + + String value; + JsonValue jsonValue = entry.getValue(); + if (jsonValue.getValueType() == JsonValue.ValueType.STRING) { + // For string values, get the actual string content (unescaped) + value = ((JsonString) jsonValue).getString(); + } else { + // For objects, arrays, numbers, booleans, null - use JSON representation + value = jsonValue.toString(); + } + + if (key.contains(L10N_KEY_SEPARATOR)) { + // Handle localized settings + String name = key.substring(0, key.indexOf(L10N_KEY_SEPARATOR)); + String lang = key.substring(key.indexOf(L10N_KEY_SEPARATOR) + L10N_KEY_SEPARATOR.length()); + return new Setting(name, lang, value); + } else { + return new Setting(key, value); + } + }) + .collect(Collectors.toSet()); + } + + /** + * Enum representing the types of operations that are performed on a bulk operation with settings. + * @implNote Although this is only meant for internal use, we use it in a public method (which needs to stay public). + * To avoid IDE warnings about exposure, let's make it public, too. 
+ */ + public enum Op { + UPDATED, + CREATED, + DELETED, + UNCHANGED; + + static JsonObjectBuilder convertToJson(Map operationalDetails) { + // Create a nice representation of what happened as JSON + JsonObjectBuilder jbo = Json.createObjectBuilder(); + JsonArrayBuilder created = Json.createArrayBuilder(); + JsonArrayBuilder updated = Json.createArrayBuilder(); + JsonArrayBuilder deleted = Json.createArrayBuilder(); + JsonArrayBuilder unchanged = Json.createArrayBuilder(); + + operationalDetails.forEach((setting, op) -> { + String name = convertToJsonKey(setting); + switch (op) { + case CREATED -> created.add(name); + case UPDATED -> updated.add(name); + case DELETED -> deleted.add(name); + case UNCHANGED -> unchanged.add(name); + } + }); + + return jbo + .add("created", created) + .add("updated", updated) + .add("deleted", deleted) + .add("unchanged", unchanged); + } + } + + /** + * Replaces all existing settings in the database with the provided set of new settings. + * This method performs the following actions: + * - Deletes any existing settings that are not present in the provided new settings. + * - Updates the content of existing settings that match the keys in the provided new settings. + * - Creates new settings that are not present in the database. + * + * If calling this method from within this class, make sure to use an EJB-injected self-reference to it. + * Otherwise, the EJB container will not be able to provide a transaction as intended by {@code @TransactionAttribute}. + * + * @param newSettings the set of new settings to replace the existing ones. + * Each setting is uniquely identified by its name and language. + * Must not be null (it may be empty). + * @return a map tracking the operations performed on each setting. The map's keys + * are the settings involved, and the values are the types of operations + * performed (CREATED, UPDATED, DELETED, UNCHANGED). + * + * @implNote Must be a public method to ensure proper transaction management. 
+ */ + @TransactionAttribute(TransactionAttributeType.REQUIRES_NEW) + public Map replaceAllSettings(Set newSettings) { + Objects.requireNonNull(newSettings, "The list of new settings cannot be null (it may be empty)."); + + // Get all existing settings as a map for O(1) lookup + List existingSettings = em.createNamedQuery("Setting.findAll", Setting.class).getResultList(); + Map existingByKey = existingSettings.stream() + .collect(Collectors.toMap( + setting -> setting.getName() + "|" + setting.getLang(), + Function.identity() + )); + + // Create map of new settings for O(1) lookup + Map newByKey = newSettings.stream() + .collect(Collectors.toMap( + setting -> setting.getName() + "|" + setting.getLang(), + Function.identity() + )); + + // Track operations for return value + Map opsTracking = new HashMap<>(); + + // Process existing settings + for (Map.Entry entry : existingByKey.entrySet()) { + String key = entry.getKey(); + Setting existingSetting = entry.getValue(); + + // Setting exists in DB but not in new set - delete it + if (!newByKey.containsKey(key)) { + em.remove(existingSetting); + opsTracking.put(existingSetting, Op.DELETED); + + // Setting exists in both - update with new values + } else { + Setting newSetting = newByKey.get(key); + if (existingSetting.getContent().equals(newSetting.getContent())) { + opsTracking.put(existingSetting, Op.UNCHANGED); + } else { + // We use the already managed entity and update it with the content of the new setting. + // (This means we don't need to call em.merge(), the ORM will track and execute it for us.) 
+ existingSetting.setContent(newSetting.getContent()); + opsTracking.put(existingSetting, Op.UPDATED); + } + } + } + + // Process new settings - create those not in existing set + for (Map.Entry entry : newByKey.entrySet()) { + String key = entry.getKey(); + Setting newSetting = entry.getValue(); + + if (!existingByKey.containsKey(key)) { + // Setting is new - persist it + em.persist(newSetting); + opsTracking.put(newSetting, Op.CREATED); + } + // If it exists, it was already handled in the previous loop + } + + // Flush changes to ensure consistency before transaction is committed (will also ensure merge() is called). + em.flush(); + + return opsTracking; + } public Map getBaseMetadataLanguageMap(Map languageMap, boolean refresh) { @@ -1028,5 +1448,76 @@ public Set getConfiguredLanguages() { langs.addAll(configuredLocales.keySet()); return langs; } - + + public static String convertToJsonKey(Setting setting) { + return setting.getName() + (setting.getLang().isEmpty() ? "" : L10N_KEY_SEPARATOR + setting.getLang()); + } + + /** + * Validates the keys in the provided settings JSON object. + * This method checks if each key follows the required format and rules. + * If a key is invalid, it is added to the list of invalid keys. 
+ * + * @param settings the JsonObject containing the keys to be validated + * @return a list of invalid keys as an unmodifiable list + */ + public static List validateKeys(JsonObject settings) { + Objects.requireNonNull(settings, "The settings object cannot be null."); + List invalidKeys = new ArrayList<>(); + for (String key : settings.keySet()) { + try { + // Case A: localized setting, validate setting and language + if (key.contains(L10N_KEY_SEPARATOR)) { + String name = key.substring(0, key.indexOf(L10N_KEY_SEPARATOR)); + String lang = key.substring(key.indexOf(L10N_KEY_SEPARATOR) + L10N_KEY_SEPARATOR.length()); + validateSettingName(name); + validateSettingLang(lang); + // Case B: Simple, non-localized setting name + } else { + validateSettingName(key); + } + } catch (SettingsValidationException sev) { + invalidKeys.add(key); + } + } + return Collections.unmodifiableList(invalidKeys); + } + + /** + * Validates the provided setting name to ensure it meets the required format. + * Throws a {@code SettingsValidationException} if the name is invalid, including cases + * where it contains a colon-separated suffix that is no longer supported. + * + * @param name The name of the setting to be validated. + * It must adhere to the allowable setting name format. + * Names with more than one colon, which may indicate deprecated suffix formats, are not allowed. + * @throws SettingsValidationException if the setting name is invalid. + */ + public static void validateSettingName(String name) { + if (SettingsServiceBean.Key.parse(name) == null) { + // If there is more than one colon, this may be someone trying to use the old suffix settings. + // Change the error message for that slightly. + if (name.replace(":","").length() < name.length() - 1) { + throw new SettingsValidationException("The name of the setting may not have a colon-separated suffix since Dataverse 6.8. 
Please update your scripts."); + } + throw new SettingsValidationException("The name of the setting is invalid."); + } + } + + /** + * Validates the provided language code to ensure it adheres to the ISO 639-1 format. + * This method checks that the language code is not null, has a length of 2 characters, + * and exists within the list of valid ISO 639-1 language codes. If the validation + * fails, a {@code SettingsValidationException} is thrown. + * + * @param lang the language code to be validated. It must be a non-null, + * 2-character string representing a valid ISO 639-1 language code. + * @throws SettingsValidationException if the language code is invalid. + */ + public static void validateSettingLang(String lang) { + if (lang == null || lang.length() != 2 || !Arrays.asList(Locale.getISOLanguages()).contains(lang)) { + throw new SettingsValidationException("The language '" + lang + "' is not a valid ISO 639-1 language code."); + } + } + } diff --git a/src/main/java/edu/harvard/iq/dataverse/settings/SettingsValidationException.java b/src/main/java/edu/harvard/iq/dataverse/settings/SettingsValidationException.java new file mode 100644 index 00000000000..e02e3234675 --- /dev/null +++ b/src/main/java/edu/harvard/iq/dataverse/settings/SettingsValidationException.java @@ -0,0 +1,10 @@ +package edu.harvard.iq.dataverse.settings; + +import jakarta.ejb.ApplicationException; + +@ApplicationException(rollback = true) +public class SettingsValidationException extends RuntimeException { + public SettingsValidationException(String message) { + super(message); + } +} diff --git a/src/main/java/edu/harvard/iq/dataverse/util/CSLUtil.java b/src/main/java/edu/harvard/iq/dataverse/util/CSLUtil.java index fe9e00bd837..213737ffeeb 100644 --- a/src/main/java/edu/harvard/iq/dataverse/util/CSLUtil.java +++ b/src/main/java/edu/harvard/iq/dataverse/util/CSLUtil.java @@ -87,7 +87,7 @@ public static List getSupportedStyles(String localeCode) { * Adapted from private retrieveStyle method 
in de.undercouch.citeproc.CSL * Retrieves a CSL style from the classpath. For example, if the given name is * ieee this method will load the file /ieee.csl - * + * * @param styleName the style's name * @return the serialized XML representation of the style * @throws IOException if the style could not be loaded @@ -119,8 +119,9 @@ public static String getCitationFormat(String styleName) throws IOException { private static String[] getCommonStyles() { if (commonStyles == null) { - commonStyles = JvmSettings.CSL_COMMON_STYLES.lookupOptional().orElse("chicago-author-date, ieee") - .split("\\s*,\\s*"); + commonStyles = ListSplitUtil.split( + JvmSettings.CSL_COMMON_STYLES.lookupOptional().orElse("chicago-author-date, ieee") + ).toArray(new String[0]); } return commonStyles; } diff --git a/src/main/java/edu/harvard/iq/dataverse/util/ListSplitUtil.java b/src/main/java/edu/harvard/iq/dataverse/util/ListSplitUtil.java new file mode 100644 index 00000000000..793eef1db7b --- /dev/null +++ b/src/main/java/edu/harvard/iq/dataverse/util/ListSplitUtil.java @@ -0,0 +1,45 @@ +package edu.harvard.iq.dataverse.util; + +import java.util.Arrays; +import java.util.Collections; +import java.util.List; +import java.util.Set; +import java.util.regex.Pattern; + +/** + * Helpers for simple admin settings that accept comma-separated lists (origins, methods, headers, etc.). + *

+ * Behavior: + * - Leading/trailing whitespace of the whole input is ignored. + * - Whitespace immediately around commas is ignored ("GET, POST" == "GET,POST"). + * - Tokens are otherwise preserved exactly as typed (no quote stripping, no escape processing). + * Not a full CSV parser: embedded commas, quoted fields with separators, and newlines inside tokens are NOT supported. + */ +public final class ListSplitUtil { + /** Split on commas, trimming any whitespace adjacent to commas. */ + private static final Pattern SPLIT = Pattern.compile("\\s*,\\s*"); + + /** + * Split a comma-separated string into tokens preserving user input (beyond removing cosmetic + * whitespace around commas and overall leading/trailing whitespace). Returns an empty list for + * null or blank input. + */ + public static List split(final String rawCsv) { + if (rawCsv == null) { + return Collections.emptyList(); + } + final String trimmedCsv = rawCsv.trim(); + if (trimmedCsv.isEmpty()) { + return Collections.emptyList(); + } + return Arrays.asList(SPLIT.split(trimmedCsv)); + } + + /** Convenience: split into a lowercase set. 
*/ + public static Set splitToLowerCaseSet(final String rawCsv) { + if (rawCsv == null || rawCsv.trim().isEmpty()) { + return Collections.emptySet(); + } + return Set.copyOf(split(rawCsv.toLowerCase())); + } +} diff --git a/src/main/java/edu/harvard/iq/dataverse/util/SystemConfig.java b/src/main/java/edu/harvard/iq/dataverse/util/SystemConfig.java index 69f9262ab5b..c1d61378f42 100644 --- a/src/main/java/edu/harvard/iq/dataverse/util/SystemConfig.java +++ b/src/main/java/edu/harvard/iq/dataverse/util/SystemConfig.java @@ -11,6 +11,7 @@ import edu.harvard.iq.dataverse.settings.JvmSettings; import edu.harvard.iq.dataverse.settings.SettingsServiceBean; import edu.harvard.iq.dataverse.validation.PasswordValidatorUtil; +import jakarta.json.stream.JsonParsingException; import org.passay.CharacterRule; import jakarta.ejb.EJB; @@ -27,6 +28,7 @@ import java.net.UnknownHostException; import java.time.Year; import java.util.Arrays; +import java.util.Collections; import java.util.HashMap; import java.util.Iterator; import java.util.List; @@ -488,49 +490,110 @@ public Integer getSearchHighlightFragmentSize() { } return null; } - - public long getTabularIngestSizeLimit() { - // This method will return the blanket ingestable size limit, if - // set on the system. I.e., the universal limit that applies to all - // tabular ingests, regardless of fromat: - - String limitEntry = settingsService.getValueForKey(SettingsServiceBean.Key.TabularIngestSizeLimit); - + + /** + * The default key used to identify tabular ingest size limits. + * This value represents the standard or fallback configuration. + * For any other valid format strings, see implementations of {@code TabularDataFileReader.getFormatName()}. + */ + public static final String TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY = "default"; + + /** + * Retrieves the tabular ingest size limits based on the system configuration. 
+ * The size limits can be defined as a JSON object with format-specific limits, a single numeric value + applied to all formats, or might not exist, in which case the default limit is applied. + * + * Note that the format names in the configuration will be transformed to lowercase for user convenience, + no matter how people like to write their formats. + * + * If the configuration contains invalid data (e.g., unparsable JSON or non-numeric values), + all tabular ingest operations are disabled by setting size limits to 0. + * + * TODO: At some later point, if and when the DB lookups or JSON parsing take too heavy a toll to bear, + we may introduce a caching singleton for these. (With TTL or using events to invalidate on update.) + * + * @return a map where the keys represent format names or a default key, and the values represent the maximum allowed size for each format. + */ + public Map getTabularIngestSizeLimits() { + String limitEntry = settingsService.getValueForKey(SettingsServiceBean.Key.TabularIngestSizeLimit); if (limitEntry != null) { - try { - Long sizeOption = Long.valueOf(limitEntry); - return sizeOption; - } catch (NumberFormatException nfe) { - logger.warning("Invalid value for TabularIngestSizeLimit option? - " + limitEntry); + // Case A: the setting is using JSON to support multiple formats + if (limitEntry.trim().startsWith("{")) { + try { + JsonObject limits = Json.createReader(new StringReader(limitEntry)).readObject(); + + Map limitsMap = new HashMap<>(); + // We add the default in case the JSON does not contain the default (which is optional). + limitsMap.put(TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY, -1L); + + for (String formatName : limits.keySet()) { + // We deliberately do not validate the formatNames here for backward compatibility. + // But we transform to lowercase here, so the casing doesn't matter for lookups. 
+ String lowercaseFormatName = formatName.toLowerCase(); + + try { + Long sizeOption = Long.valueOf(limits.getString(formatName)); + limitsMap.put(lowercaseFormatName, sizeOption); + } catch (ClassCastException cce) { + logger.warning("Could not convert " + SettingsServiceBean.Key.TabularIngestSizeLimit + " to long from JSON integer. You must provide the long number as string (use quotes) for format " + formatName); + logger.warning("Disabling all tabular ingest completely until fixed!"); + return Map.of(TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY, 0L); + } catch (NumberFormatException nfe) { + logger.warning("Could not convert " + SettingsServiceBean.Key.TabularIngestSizeLimit + " to long for format " + formatName + " (not a number)"); + logger.warning("Disabling all tabular ingest completely until fixed!"); + return Map.of(TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY, 0L); + } + } + + return Collections.unmodifiableMap(limitsMap); + } catch (JsonParsingException e) { + logger.warning("Invalid TabularIngestSizeLimit option found, cannot parse JSON: " + e.getMessage()); + logger.warning("Disabling all tabular ingest completely until fixed!"); + return Map.of(TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY, 0L); + } + // Case B: It might be just a simple Long, providing a default for all formats. + } else { + try { + Long limit = Long.valueOf(limitEntry); + return Map.of(TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY, limit); + } catch (NumberFormatException nfe) { + logger.warning("Could not convert " + SettingsServiceBean.Key.TabularIngestSizeLimit + " to long: " + nfe); + logger.warning("Disabling all tabular ingest completely until fixed!"); + return Map.of(TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY, 0L); + } } } - // -1 means no limit is set; - // 0 on the other hand would mean that ingest is fully disabled for - // tabular data. 
- return -1; + + // Default is not to limit at all + return Map.of(TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY, -1L); + } + + /** + * This method will return the blanket ingestable size limit, if set on the system. + * I.e., the universal limit that applies to all tabular ingests, regardless of format. + * @return -1 = unlimited if not set, 0 if disabled or invalid, some long number of bytes otherwise + */ + public long getTabularIngestSizeLimit() { + return getTabularIngestSizeLimits().get(TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY); } + /** + * Retrieves the size limit for tabular data ingestion based on the provided format name. + * The format name will be converted to lowercase, making sure the casing doesn't matter. + * + * @param formatName The name of the format for which the size limit is requested + * See also implementations of {@code TabularDataFileReader.getFormatName()} for examples. + * @return The size limit in bytes for tabular data ingestion associated with the specified format name, + * or the default size limit if no format-specific limit is found or its name is invalid (null, blank, ...). + * -1 = unlimited if not set, 0 if disabled or invalid, some long number of bytes otherwise + */ + public long getTabularIngestSizeLimit(String formatName) { - // This method returns the size limit set specifically for this format name, - // if available, otherwise - the blanket limit that applies to all tabular - // ingests regardless of a format. - - if (formatName == null || formatName.equals("")) { - return getTabularIngestSizeLimit(); - } - - String limitEntry = settingsService.get(SettingsServiceBean.Key.TabularIngestSizeLimit.toString() + ":" + formatName); - - if (limitEntry != null) { - try { - Long sizeOption = Long.valueOf(limitEntry); - return sizeOption; - } catch (NumberFormatException nfe) { - logger.warning("Invalid value for TabularIngestSizeLimit:" + formatName + "? 
- " + limitEntry ); - } + if (formatName != null && !formatName.isBlank()) { + // We convert to lowercase so it doesn't matter which variant someone uses in the JSON config + String convertedFormatName = formatName.toLowerCase(); + return getTabularIngestSizeLimits().getOrDefault(convertedFormatName, getTabularIngestSizeLimit()); } - - return getTabularIngestSizeLimit(); + return getTabularIngestSizeLimit(); } public boolean isOAIServerEnabled() { @@ -938,11 +1001,12 @@ public boolean isRsyncOnly(){ return false; } String uploadMethods = settingsService.getValueForKey(SettingsServiceBean.Key.UploadMethods); - if (uploadMethods==null){ + if (uploadMethods == null) { return false; - } else { - return Arrays.asList(uploadMethods.toLowerCase().split("\\s*,\\s*")).size() == 1 && uploadMethods.toLowerCase().equals(SystemConfig.FileUploadMethods.RSYNC.toString()); } + String normalizedUploadMethods = uploadMethods.toLowerCase(); + return ListSplitUtil.split(normalizedUploadMethods).size() == 1 + && normalizedUploadMethods.equals(SystemConfig.FileUploadMethods.RSYNC.toString()); } @Deprecated(forRemoval = true, since = "2024-07-07") @@ -972,18 +1036,16 @@ private Boolean getMethodAvailable(String method, boolean upload) { upload ? 
SettingsServiceBean.Key.UploadMethods : SettingsServiceBean.Key.DownloadMethods); if (methods == null) { return false; - } else { - return Arrays.asList(methods.toLowerCase().split("\\s*,\\s*")).contains(method); } + return ListSplitUtil.split(methods.toLowerCase()).contains(method); } public Integer getUploadMethodCount(){ String uploadMethods = settingsService.getValueForKey(SettingsServiceBean.Key.UploadMethods); - if (uploadMethods==null){ + if (uploadMethods == null) { return 0; - } else { - return Arrays.asList(uploadMethods.toLowerCase().split("\\s*,\\s*")).size(); - } + } + return ListSplitUtil.split(uploadMethods.toLowerCase()).size(); } public boolean isAllowCustomTerms() { diff --git a/src/main/java/edu/harvard/iq/dataverse/util/bagit/BagGenerator.java b/src/main/java/edu/harvard/iq/dataverse/util/bagit/BagGenerator.java index f6b12d5f904..f24ebdb8655 100644 --- a/src/main/java/edu/harvard/iq/dataverse/util/bagit/BagGenerator.java +++ b/src/main/java/edu/harvard/iq/dataverse/util/bagit/BagGenerator.java @@ -75,6 +75,7 @@ import edu.harvard.iq.dataverse.DataFile.ChecksumType; import edu.harvard.iq.dataverse.pidproviders.PidUtil; import edu.harvard.iq.dataverse.settings.JvmSettings; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.BagGeneratorThreads; import edu.harvard.iq.dataverse.util.json.JsonLDTerm; import java.util.Optional; @@ -120,7 +121,7 @@ public class BagGenerator { private boolean usetemp = false; private int numConnections = 8; - public static final String BAG_GENERATOR_THREADS = ":BagGeneratorThreads"; + public static final String BAG_GENERATOR_THREADS = BagGeneratorThreads.toString(); private OREMap oremap; diff --git a/src/main/java/edu/harvard/iq/dataverse/util/bagit/BagValidator.java b/src/main/java/edu/harvard/iq/dataverse/util/bagit/BagValidator.java index a9052bf4c80..85a2f3f09ff 100644 --- a/src/main/java/edu/harvard/iq/dataverse/util/bagit/BagValidator.java +++ 
b/src/main/java/edu/harvard/iq/dataverse/util/bagit/BagValidator.java @@ -1,5 +1,8 @@ package edu.harvard.iq.dataverse.util.bagit; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.BagValidatorJobPoolSize; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.BagValidatorJobWaitInterval; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.BagValidatorMaxErrors; import edu.harvard.iq.dataverse.util.BundleUtil; import edu.harvard.iq.dataverse.util.bagit.BagValidation.FileValidationResult; import edu.harvard.iq.dataverse.util.bagit.ManifestReader.ManifestChecksum; @@ -27,9 +30,9 @@ public class BagValidator { private static final Logger logger = Logger.getLogger(BagValidator.class.getCanonicalName()); public static enum BagValidatorSettings { - JOB_POOL_SIZE(":BagValidatorJobPoolSize", 4), - MAX_ERRORS(":BagValidatorMaxErrors", 5), - JOB_WAIT_INTERVAL(":BagValidatorJobWaitInterval", 10); + JOB_POOL_SIZE(BagValidatorJobPoolSize.toString(), 4), + MAX_ERRORS(BagValidatorMaxErrors.toString(), 5), + JOB_WAIT_INTERVAL(BagValidatorJobWaitInterval.toString(), 10); private String settingsKey; private Integer defaultValue; diff --git a/src/main/java/edu/harvard/iq/dataverse/util/file/BagItFileHandlerFactory.java b/src/main/java/edu/harvard/iq/dataverse/util/file/BagItFileHandlerFactory.java index 4b0263030dc..1d1b6b5b7aa 100644 --- a/src/main/java/edu/harvard/iq/dataverse/util/file/BagItFileHandlerFactory.java +++ b/src/main/java/edu/harvard/iq/dataverse/util/file/BagItFileHandlerFactory.java @@ -1,6 +1,7 @@ package edu.harvard.iq.dataverse.util.file; import edu.harvard.iq.dataverse.settings.SettingsServiceBean; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.BagItHandlerEnabled; import edu.harvard.iq.dataverse.util.bagit.BagValidator; import edu.harvard.iq.dataverse.util.bagit.BagValidator.BagValidatorSettings; import edu.harvard.iq.dataverse.util.bagit.ManifestReader; @@ -25,7 +26,7 @@ 
public class BagItFileHandlerFactory implements Serializable { private static final Logger logger = Logger.getLogger(BagItFileHandlerFactory.class.getCanonicalName()); - public static final String BAGIT_HANDLER_ENABLED_SETTING = ":BagItHandlerEnabled"; + public static final String BAGIT_HANDLER_ENABLED_SETTING = BagItHandlerEnabled.toString(); @EJB private SettingsServiceBean settingsService; diff --git a/src/main/java/edu/harvard/iq/dataverse/util/json/JsonPrinter.java b/src/main/java/edu/harvard/iq/dataverse/util/json/JsonPrinter.java index 46a05fc93f2..654154c5b64 100644 --- a/src/main/java/edu/harvard/iq/dataverse/util/json/JsonPrinter.java +++ b/src/main/java/edu/harvard/iq/dataverse/util/json/JsonPrinter.java @@ -1719,7 +1719,17 @@ public static JsonArrayBuilder json(List notifications, Authen return notificationsArray; } + + public static JsonObjectBuilder jsonLanguage(String locale, String title) { + // returns a single metadata language entry + return jsonObjectBuilder().add("locale", locale).add("title", title); + } + public static JsonArrayBuilder jsonLanguage(Map langMap) { + // returns an array of metadata languages + return Json.createArrayBuilder(langMap.entrySet().stream().map(entry -> jsonLanguage(entry.getKey(), entry.getValue())).toList()); + } + public static JsonArrayBuilder jsonDatasetVersionSummaries(List summaries) { JsonArrayBuilder arrayBuilder = Json.createArrayBuilder(); summaries.stream() diff --git a/src/main/java/edu/harvard/iq/dataverse/util/json/JsonUtil.java b/src/main/java/edu/harvard/iq/dataverse/util/json/JsonUtil.java index 72a1cd2e1eb..737d67d8245 100644 --- a/src/main/java/edu/harvard/iq/dataverse/util/json/JsonUtil.java +++ b/src/main/java/edu/harvard/iq/dataverse/util/json/JsonUtil.java @@ -8,12 +8,14 @@ import java.io.StringWriter; import java.util.HashMap; import java.util.Map; +import java.util.Objects; import java.util.logging.Logger; import jakarta.json.Json; import jakarta.json.JsonArray; import
jakarta.json.JsonException; import jakarta.json.JsonObject; import jakarta.json.JsonReader; +import jakarta.json.JsonValue; import jakarta.json.JsonWriter; import jakarta.json.JsonWriterFactory; import jakarta.json.stream.JsonGenerator; @@ -131,4 +133,34 @@ public static JsonArray getJsonArray(String serializedJson) { } } } + + + /** + * Parses a serialized JSON string and returns it as a JsonValue. + * The returned JsonValue can be a JsonObject, JsonArray, or another type + * based on the structure of the provided serialized JSON string. + * This method closes its resources but does not catch any exceptions. + * + * @param serializedJson The JSON content serialized as a String + * @return The parsed content as a JsonValue which could be a JsonObject, JsonArray, or another JsonValue type + * @throws JsonException If an error occurs during parsing (null, invalid JSON, not trimmed, etc.) + */ + public static JsonValue getJsonValue(String serializedJson) { + if (serializedJson == null) { + throw new JsonException("The serialized JSON string cannot be null."); + } + + try (StringReader rdr = new StringReader(serializedJson)) { + try (JsonReader jsonReader = Json.createReader(rdr)) { + JsonValue jsonValue = jsonReader.read(); + if (jsonValue.getValueType() == JsonValue.ValueType.OBJECT) { + return jsonValue.asJsonObject(); + } else if (jsonValue.getValueType() == JsonValue.ValueType.ARRAY) { + return jsonValue.asJsonArray(); + } else { + return jsonValue; + } + } + } + } } diff --git a/src/main/java/edu/harvard/iq/dataverse/workflow/internalspi/LDNAnnounceDatasetVersionStep.java b/src/main/java/edu/harvard/iq/dataverse/workflow/internalspi/LDNAnnounceDatasetVersionStep.java index 124eea801d9..49ca77573da 100644 --- a/src/main/java/edu/harvard/iq/dataverse/workflow/internalspi/LDNAnnounceDatasetVersionStep.java +++ b/src/main/java/edu/harvard/iq/dataverse/workflow/internalspi/LDNAnnounceDatasetVersionStep.java @@ -5,6 +5,9 @@ import 
edu.harvard.iq.dataverse.DatasetFieldType; import edu.harvard.iq.dataverse.DatasetVersion; import edu.harvard.iq.dataverse.branding.BrandingUtil; +import edu.harvard.iq.dataverse.util.ListSplitUtil; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.LDNAnnounceRequiredFields; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.LDNTarget; import edu.harvard.iq.dataverse.util.SystemConfig; import edu.harvard.iq.dataverse.util.bagit.OREMap; import edu.harvard.iq.dataverse.util.json.JsonLDTerm; @@ -46,14 +49,14 @@ * anounce new dataset versions to the Harvard DASH preprint repository so that * a DASH admin can create a backlink for any dataset versions that reference a * DASH deposit or a paper with a DOI where DASH has a preprint copy. - * + * * @author qqmyers */ public class LDNAnnounceDatasetVersionStep implements WorkflowStep { private static final Logger logger = Logger.getLogger(LDNAnnounceDatasetVersionStep.class.getName()); - private static final String REQUIRED_FIELDS = ":LDNAnnounceRequiredFields"; - private static final String LDN_TARGET = ":LDNTarget"; + private static final String REQUIRED_FIELDS = LDNAnnounceRequiredFields.toString(); + private static final String LDN_TARGET = LDNTarget.toString(); private static final String RELATED_PUBLICATION = "publication"; JsonLDTerm publicationIDType = null; @@ -74,7 +77,7 @@ public WorkflowStepResult run(WorkflowContext context) { CloseableHttpClient client = HttpClients.createDefault(); // build method - + HttpPost announcement; try { announcement = buildAnnouncement(false, context, target); @@ -124,8 +127,7 @@ HttpPost buildAnnouncement(boolean qb, WorkflowContext ctxt, JsonObject target) DatasetVersion dv = ctxt.getDataset().getReleasedVersion(); List dvf = dv.getDatasetFields(); Map fields = new HashMap(); - String[] requiredFields = ((String) ctxt.getSettings().getOrDefault(REQUIRED_FIELDS, "")).split(",\\s*"); - for (String field : requiredFields) { + for (String 
field : ListSplitUtil.split((String) ctxt.getSettings().getOrDefault(REQUIRED_FIELDS, ""))) { fields.put(field, null); } Set reqFields = fields.keySet(); diff --git a/src/main/java/propertyFiles/Bundle.properties b/src/main/java/propertyFiles/Bundle.properties index 3254c26ed22..e93280cca2d 100644 --- a/src/main/java/propertyFiles/Bundle.properties +++ b/src/main/java/propertyFiles/Bundle.properties @@ -1251,6 +1251,7 @@ dataverse.permissionsFiles.files.invalidMsg=There are no restricted files in thi dataverse.permissionsFiles.files.requested=Requested Files dataverse.permissionsFiles.files.selected=Selecting {0} of {1} {2} dataverse.permissionsFiles.files.includeDeleted=Include Deleted Files +dataverse.permissionsFiles.files.draftUnpublished=Draft/Unpublished dataverse.permissionsFiles.viewRemoveDialog.header=File Access dataverse.permissionsFiles.viewRemoveDialog.removeBtn=Remove Access dataverse.permissionsFiles.viewRemoveDialog.removeBtn.confirmation=Are you sure you want to remove access to this file? Once access has been removed, the user or group will no longer be able to download this file. diff --git a/src/main/resources/db/migration/V6.8.0.1.sql b/src/main/resources/db/migration/V6.8.0.1.sql new file mode 100644 index 00000000000..8e810270b06 --- /dev/null +++ b/src/main/resources/db/migration/V6.8.0.1.sql @@ -0,0 +1,97 @@ +-- Migrates the old database settings to their valid and aligned successors. #11639 +-- 1. ":TabularIngestSizeLimit" database setting used format suffixes; move to a JSON-based approach +-- 2. (see below) "BuiltinUsers.KEY" was never aligned with any of the other settings names.
+DO $$ + DECLARE + base_setting_content TEXT; + format_settings_cursor CURSOR FOR + SELECT name, content + FROM setting + WHERE name LIKE ':TabularIngestSizeLimit:%' + AND lang IS NULL; + format_record RECORD; + format_name TEXT; + format_value BIGINT; + json_object JSONB := '{}'; + has_format_settings BOOLEAN := FALSE; + warning_message TEXT; + BEGIN + -- Check if there are any format-specific settings + SELECT EXISTS( + SELECT 1 FROM setting + WHERE name LIKE ':TabularIngestSizeLimit:%' + AND lang IS NULL + ) INTO has_format_settings; + + -- Only proceed if we have format-specific settings + IF NOT has_format_settings THEN + RAISE NOTICE 'No format-specific TabularIngestSizeLimit settings found. Skipping migration.'; + RETURN; + END IF; + + -- Get the base setting (without format suffix) if it exists + SELECT content INTO base_setting_content + FROM setting + WHERE name = ':TabularIngestSizeLimit' + AND lang IS NULL; + + -- Add base setting to JSON object if it exists + IF base_setting_content IS NOT NULL THEN + -- Validate that base setting is numeric + BEGIN + format_value := base_setting_content::BIGINT; + json_object := json_object || jsonb_build_object('default', format_value); + EXCEPTION WHEN invalid_text_representation THEN + RAISE WARNING 'Base TabularIngestSizeLimit setting contains non-numeric value: %. 
Setting it to 0 (disabling ingest!).', base_setting_content; + json_object := json_object || jsonb_build_object('default', 0); + END; + END IF; + + -- Process format-specific settings + FOR format_record IN format_settings_cursor LOOP + -- Extract format name (everything after ":TabularIngestSizeLimit:") + format_name := substring(format_record.name from ':TabularIngestSizeLimit:(.*)'); + + -- Validate and convert the content to numeric + BEGIN + format_value := format_record.content::BIGINT; + json_object := json_object || jsonb_build_object(format_name, format_value); + EXCEPTION WHEN invalid_text_representation THEN + warning_message := format('Format-specific TabularIngestSizeLimit setting %s contains non-numeric value: %s. Setting it to 0 (disabling ingest!).', + format_record.name, format_record.content); + RAISE WARNING '%', warning_message; + json_object := json_object || jsonb_build_object(format_name, 0); + END; + END LOOP; + + -- Insert or update the new JSON-based setting + INSERT INTO setting (name, content, lang) + VALUES (':TabularIngestSizeLimit', json_object::TEXT, NULL) + ON CONFLICT (name) WHERE lang IS NULL + DO UPDATE SET content = EXCLUDED.content; + + -- Delete all format-specific settings + DELETE FROM setting + WHERE name LIKE ':TabularIngestSizeLimit:%' + AND lang IS NULL; + + RAISE NOTICE 'Successfully migrated TabularIngestSizeLimit settings to JSON format: %', json_object::TEXT; + END $$; + +-- 2. Migrate BuiltinUsers.KEY to the new setting name +DO $$ + BEGIN + IF EXISTS (SELECT 1 FROM setting WHERE name = 'BuiltinUsers.KEY') THEN + INSERT INTO setting (name, lang, content) VALUES (':BuiltinUsersKey', NULL, (SELECT content FROM setting WHERE name = 'BuiltinUsers.KEY')); + DELETE FROM setting WHERE name = 'BuiltinUsers.KEY'; + END IF; + END $$; + +-- 3. 
Migrate WorkflowsAdmin#IP_WHITELIST_KEY to the new setting name +DO $$ + BEGIN + IF EXISTS (SELECT 1 FROM setting WHERE name = 'WorkflowsAdmin#IP_WHITELIST_KEY') THEN + INSERT INTO setting (name, lang, content) VALUES (':WorkflowsAdminIpWhitelist', NULL, (SELECT content FROM setting WHERE name = 'WorkflowsAdmin#IP_WHITELIST_KEY')); + DELETE FROM setting WHERE name = 'WorkflowsAdmin#IP_WHITELIST_KEY'; + END IF; + END $$; diff --git a/src/main/resources/db/migration/V6.8.0.2.sql b/src/main/resources/db/migration/V6.8.0.2.sql new file mode 100644 index 00000000000..a7493359ad5 --- /dev/null +++ b/src/main/resources/db/migration/V6.8.0.2.sql @@ -0,0 +1,42 @@ +-- Update Setting table structure for changes from #11639 +-- 1. Change column types from TEXT to VARCHAR for better performance +-- 2. Update lang column to use empty string default instead of NULL (avoid non-unique pairs) +-- 3. Add NOT NULL constraints and unique constraint for name+lang pairs + +DO $$ +BEGIN + -- These database constraints were added with Dataverse 4.15, but they had no representation in the code, + -- not even a comment about their existence. See also Flyway script V4.16.0.1__5303-addColumn-to-settingTable.sql. + -- We are going to replace them with the new design here, using an empty lang as default. + -- Before, lang could be more or less anything. Now we impose restrictions via validation within the API.
+ ALTER TABLE setting DROP CONSTRAINT IF EXISTS non_empty_lang; + DROP INDEX IF EXISTS unique_settings; + + -- Now, update any existing NULL lang values to empty string (we cannot do this before lifting the restrictions) + -- This also needs to be done before we try to alter the table to not allow NULL for setting.lang + UPDATE setting SET lang = '' WHERE lang IS NULL; + + -- Only alter columns if they need to be changed + -- (Note: Postgres doesn't support IF NOT EXISTS for ALTER COLUMN or ADD CONSTRAINT, so we need conditional logic) + IF EXISTS (SELECT 1 FROM information_schema.columns + WHERE table_name = 'setting' AND column_name = 'name' + AND (data_type = 'text' OR is_nullable = 'YES')) THEN + ALTER TABLE setting ALTER COLUMN name TYPE VARCHAR(255); + ALTER TABLE setting ALTER COLUMN name SET NOT NULL; + END IF; + + IF EXISTS (SELECT 1 FROM information_schema.columns + WHERE table_name = 'setting' AND column_name = 'lang' + AND (data_type = 'text' OR is_nullable = 'YES')) THEN + ALTER TABLE setting ALTER COLUMN lang TYPE VARCHAR(10); + ALTER TABLE setting ALTER COLUMN lang SET NOT NULL; + ALTER TABLE setting ALTER COLUMN lang SET DEFAULT ''; + END IF; + + IF NOT EXISTS (SELECT 1 FROM information_schema.table_constraints + WHERE table_name = 'setting' + AND constraint_name = 'uc_setting_name_lang' + AND constraint_type = 'UNIQUE') THEN + ALTER TABLE setting ADD CONSTRAINT uc_setting_name_lang UNIQUE (name, lang); + END IF; +END $$; diff --git a/src/main/webapp/permissions-manage-files.xhtml b/src/main/webapp/permissions-manage-files.xhtml index 5f450c4ae63..cc04af6c022 100644 --- a/src/main/webapp/permissions-manage-files.xhtml +++ b/src/main/webapp/permissions-manage-files.xhtml @@ -387,6 +387,7 @@ + diff --git a/src/test/java/edu/harvard/iq/dataverse/EditDatafilesPageTest.java b/src/test/java/edu/harvard/iq/dataverse/EditDatafilesPageTest.java new file mode 100644 index 00000000000..11578b71f0e --- /dev/null +++ 
b/src/test/java/edu/harvard/iq/dataverse/EditDatafilesPageTest.java @@ -0,0 +1,64 @@ +package edu.harvard.iq.dataverse; + +import edu.harvard.iq.dataverse.util.SystemConfig; +import org.junit.jupiter.api.Test; +import org.mockito.InjectMocks; +import org.mockito.Mock; +import org.mockito.MockitoAnnotations; + +import java.util.HashMap; +import java.util.Map; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertTrue; +import static org.mockito.Mockito.when; + +class EditDatafilesPageTest { + + @InjectMocks + private EditDatafilesPage editDatafilesPage; + + @Mock + private SystemConfig systemConfig; + + public EditDatafilesPageTest() { + MockitoAnnotations.openMocks(this); + } + + @Test + void testPopulateHumanPerFormatTabularLimits_WithEmptyLimits() { + Map tabularLimits = new HashMap<>(); + when(systemConfig.getTabularIngestSizeLimits()).thenReturn(tabularLimits); + + String result = editDatafilesPage.populateHumanPerFormatTabularLimits(); + + assertEquals("", result, "Expected no formatted limits when the map is empty"); + } + + @Test + void testPopulateHumanPerFormatTabularLimits_WithNonDefaultLimits() { + Map tabularLimits = new HashMap<>(); + tabularLimits.put("csv", 10485760L); // 10MB + tabularLimits.put("tsv", 5242880L); // 5MB + when(systemConfig.getTabularIngestSizeLimits()).thenReturn(tabularLimits); + + String result = editDatafilesPage.populateHumanPerFormatTabularLimits(); + + assertTrue(result.contains("csv: 10.0 MB"), "Expected CSV limit in human-readable format, but got: " + result); + assertTrue(result.contains("tsv: 5.0 MB"), "Expected TSV limit in human-readable format, but got: " + result); + } + + @Test + void testPopulateHumanPerFormatTabularLimits_WithDefaultKey() { + Map tabularLimits = new HashMap<>(); + tabularLimits.put(SystemConfig.TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY, 2097152L); // 2MB + 
tabularLimits.put("csv", 10485760L); // 10MB + when(systemConfig.getTabularIngestSizeLimits()).thenReturn(tabularLimits); + + String result = editDatafilesPage.populateHumanPerFormatTabularLimits(); + + assertTrue(result.contains("csv: 10.0 MB"), "Expected CSV limit in human-readable format, but got: " + result); + assertFalse(result.contains("default"), "Default key should be excluded from the output"); + } +} \ No newline at end of file diff --git a/src/test/java/edu/harvard/iq/dataverse/api/AdminIT.java b/src/test/java/edu/harvard/iq/dataverse/api/AdminIT.java index b48c5507a54..6f3ffaa83b8 100644 --- a/src/test/java/edu/harvard/iq/dataverse/api/AdminIT.java +++ b/src/test/java/edu/harvard/iq/dataverse/api/AdminIT.java @@ -1,38 +1,48 @@ package edu.harvard.iq.dataverse.api; -import io.restassured.RestAssured; -import io.restassured.path.json.JsonPath; -import io.restassured.response.Response; import edu.harvard.iq.dataverse.DataFile; import edu.harvard.iq.dataverse.authorization.providers.builtin.BuiltinAuthenticationProvider; import edu.harvard.iq.dataverse.authorization.providers.oauth2.impl.GitHubOAuth2AP; import edu.harvard.iq.dataverse.authorization.providers.oauth2.impl.OrcidOAuth2AP; import edu.harvard.iq.dataverse.settings.SettingsServiceBean; - -import java.io.IOException; -import java.nio.file.Files; -import java.nio.file.Paths; -import java.util.ArrayList; -import java.util.HashMap; -import java.util.List; - +import io.restassured.RestAssured; +import io.restassured.path.json.JsonPath; +import io.restassured.response.Response; import jakarta.json.Json; import jakarta.json.JsonArray; +import jakarta.json.JsonObject; +import org.junit.jupiter.api.AfterAll; +import org.junit.jupiter.api.Assumptions; +import org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.api.Disabled; +import org.junit.jupiter.api.Nested; import org.junit.jupiter.api.Test; -import org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.params.ParameterizedTest; import 
org.junit.jupiter.params.provider.ValueSource; - - +import java.io.IOException; +import java.nio.file.Files; +import java.nio.file.Paths; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; import java.util.Map; import java.util.UUID; import java.util.logging.Logger; -import static jakarta.ws.rs.core.Response.Status.*; -import static org.hamcrest.CoreMatchers.*; +import static io.restassured.RestAssured.given; +import static jakarta.ws.rs.core.Response.Status.BAD_REQUEST; +import static jakarta.ws.rs.core.Response.Status.CREATED; +import static jakarta.ws.rs.core.Response.Status.FORBIDDEN; +import static jakarta.ws.rs.core.Response.Status.INTERNAL_SERVER_ERROR; +import static jakarta.ws.rs.core.Response.Status.NOT_FOUND; +import static jakarta.ws.rs.core.Response.Status.OK; +import static jakarta.ws.rs.core.Response.Status.UNAUTHORIZED; +import static org.hamcrest.CoreMatchers.containsString; +import static org.hamcrest.CoreMatchers.equalTo; +import static org.hamcrest.CoreMatchers.notNullValue; import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; import static org.junit.jupiter.api.Assertions.assertTrue; public class AdminIT { @@ -45,7 +55,141 @@ public class AdminIT { public static void setUp() { RestAssured.baseURI = UtilIT.getRestAssuredBaseUri(); } - + + @Nested + class SettingsAPI { + + static final SettingsServiceBean.Key harmlessSetting = SettingsServiceBean.Key.InstallationName; + static final String harmlessValue = "Test Instance Name"; + static final String language = "fr"; + static final String harmlessL10nValue = "Nom de l'instance de test"; + + @AfterAll + static void destroy() { + // No leftover settings after breaking tests! 
+ UtilIT.deleteSetting(harmlessSetting); + UtilIT.deleteSetting(harmlessSetting, language); + } + + @Test + void testSettingsRoundTrip() { + Assumptions.assumeTrue(UtilIT.getSetting(harmlessSetting).statusCode() == NOT_FOUND.getStatusCode(), "Harmless setting should not exist yet."); + Assumptions.assumeTrue(UtilIT.getSetting(harmlessSetting, language).statusCode() == NOT_FOUND.getStatusCode(), "Harmless localized setting should not exist yet."); + + // Step 0: Add a localized setting so we can make sure the put all can cope with that, too. + UtilIT.setSetting(harmlessSetting, harmlessL10nValue, language); + + // Step 1: Get current settings state + Response getResponse = UtilIT.getSettings(); + + getResponse.then() + .assertThat() + .statusCode(OK.getStatusCode()) + .contentType("application/json") + .body("status", equalTo("OK")) + .body("data.'"+harmlessSetting+"/lang/"+language+"'", equalTo(harmlessL10nValue)); + + // Store original settings as JsonObject for later restoration + JsonObject originalSettings = Json.createReader(getResponse.body().asInputStream()) + .readObject() + .getJsonObject("data"); + + // Step 2: Set our harmless test setting using UtilIT + Response setResponse = UtilIT.setSetting(harmlessSetting.toString(), harmlessValue); + setResponse.then() + .assertThat() + .statusCode(OK.getStatusCode()); + + // Step 3: Verify the harmless setting was set + Response verifySetResponse = UtilIT.getSetting(harmlessSetting); + + verifySetResponse.then() + .assertThat() + .statusCode(OK.getStatusCode()) + .body("data.message", equalTo(harmlessValue)); + + // Step 4: Put back the original settings (this is what we're testing) + Response putResponse = UtilIT.setSettings(originalSettings.toString()); + + putResponse.then() + .assertThat() + .statusCode(OK.getStatusCode()) + .body("status", equalTo("OK")) + .body("message.message", containsString("successfully updated")); + + // Step 5: Verify the harmless setting is gone (restored to original state) + 
Response verifyRestoredResponse = UtilIT.getSetting(harmlessSetting); + + verifyRestoredResponse.then() + .assertThat() + .statusCode(NOT_FOUND.getStatusCode()); // Should not exist anymore + + // Step 6: Verify overall settings state matches original + Response finalGetResponse = UtilIT.getSettings(); + + finalGetResponse.then() + .assertThat() + .statusCode(OK.getStatusCode()); + + // Parse the final settings state from the response we just fetched + JsonObject finalSettings = Json.createReader(finalGetResponse.body().asInputStream()) + .readObject() + .getJsonObject("data"); + + // Verify the settings are back to original state (our test setting should be absent) + assertFalse(finalSettings.containsKey(harmlessSetting.toString()), "Harmless setting should not exist in restored settings"); + + // Cleanup: delete the localized setting + UtilIT.deleteSetting(harmlessSetting, language); + } + + @Test + void testGetAllSettingsWithLocalization() { + int statusCode = UtilIT.getSetting(harmlessSetting, language).statusCode(); + Assumptions.assumeTrue(statusCode == NOT_FOUND.getStatusCode(), "Harmless localized setting should not exist yet.
Status Code: " + statusCode); + + // Given + UtilIT.setSetting(harmlessSetting, harmlessL10nValue, language); + + // When + Response getResponse = UtilIT.getSettings(); + + // Then + getResponse.then() + .assertThat() + .statusCode(OK.getStatusCode()) + .contentType("application/json") + .body("status", equalTo("OK")) + .body("data.'"+harmlessSetting+"/lang/"+language+"'", equalTo(harmlessL10nValue)); + + // Cleanup + UtilIT.deleteSetting(harmlessSetting, language); + } + + @Test + void testPutAllSettingsWithEmptyJson() { + // Test error handling for empty JSON + Response response = UtilIT.setSettings("{}"); + + response.then() + .assertThat() + .statusCode(BAD_REQUEST.getStatusCode()) + .body("message", containsString("Empty or invalid JSON object")); + } + + @Test + void testPutAllSettingsWithInvalidSetting() { + // Test error handling for invalid setting names + Response response = UtilIT.setSettings("{\":Test1\": \"Foobar\", \":Test2\": \"Foobar\" }"); + + response.then() + .assertThat() + .statusCode(BAD_REQUEST.getStatusCode()) + .body("message", containsString("Invalid key(s): :Test1, :Test2")); + } + } + + + @Test public void testListAuthenticatedUsers() throws Exception { Response anon = UtilIT.listAuthenticatedUsers(testNonSuperuserApiToken); @@ -75,7 +219,7 @@ public void testListAuthenticatedUsers() throws Exception { Response deleteSuperuser = UtilIT.deleteUser(superuserUsername); assertEquals(200, deleteSuperuser.getStatusCode()); - } +} @Test public void testFilterAuthenticatedUsersForbidden() throws Exception { diff --git a/src/test/java/edu/harvard/iq/dataverse/api/BagIT.java b/src/test/java/edu/harvard/iq/dataverse/api/BagIT.java index c80e321b228..16c44003f35 100644 --- a/src/test/java/edu/harvard/iq/dataverse/api/BagIT.java +++ b/src/test/java/edu/harvard/iq/dataverse/api/BagIT.java @@ -2,6 +2,8 @@ import edu.harvard.iq.dataverse.engine.command.impl.LocalSubmitToArchiveCommand; import edu.harvard.iq.dataverse.settings.SettingsServiceBean; +import static
edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.BagGeneratorThreads; +import static edu.harvard.iq.dataverse.settings.SettingsServiceBean.Key.BagItLocalPath; import io.restassured.RestAssured; import static io.restassured.RestAssured.given; import io.restassured.response.Response; @@ -36,11 +38,13 @@ public static void setUpClass() { setArchiverClassName.then().assertThat() .statusCode(OK.getStatusCode()); - Response setArchiverSettings = UtilIT.setSetting(SettingsServiceBean.Key.ArchiverSettings, ":BagItLocalPath, :BagGeneratorThreads"); + // BagGeneratorThreads isn't used. Consider setting it or removing it. + Response setArchiverSettings = UtilIT.setSetting(SettingsServiceBean.Key.ArchiverSettings, + String.join(", ", BagItLocalPath.toString(), BagGeneratorThreads.toString())); setArchiverSettings.then().assertThat() .statusCode(OK.getStatusCode()); - Response setBagItLocalPath = UtilIT.setSetting(":BagItLocalPath", bagitExportDir); + Response setBagItLocalPath = UtilIT.setSetting(BagItLocalPath.toString(), bagitExportDir); setBagItLocalPath.then().assertThat() .statusCode(OK.getStatusCode()); diff --git a/src/test/java/edu/harvard/iq/dataverse/api/DatasetsIT.java b/src/test/java/edu/harvard/iq/dataverse/api/DatasetsIT.java index f627779e14a..41f79f3dab5 100644 --- a/src/test/java/edu/harvard/iq/dataverse/api/DatasetsIT.java +++ b/src/test/java/edu/harvard/iq/dataverse/api/DatasetsIT.java @@ -2401,7 +2401,7 @@ public void testFileChecksum() { Response getDefaultSetting = UtilIT.getSetting(SettingsServiceBean.Key.FileFixityChecksumAlgorithm); getDefaultSetting.prettyPrint(); getDefaultSetting.then().assertThat() - .body("message", equalTo("Setting :FileFixityChecksumAlgorithm not found")); + .body("message", equalTo("Setting :FileFixityChecksumAlgorithm not found.")); Response uploadMd5File = UtilIT.uploadRandomFile(dataset1PersistentId, apiToken); uploadMd5File.prettyPrint(); diff --git a/src/test/java/edu/harvard/iq/dataverse/api/DataversesIT.java 
b/src/test/java/edu/harvard/iq/dataverse/api/DataversesIT.java index 307af623120..cfa91e44cc0 100644 --- a/src/test/java/edu/harvard/iq/dataverse/api/DataversesIT.java +++ b/src/test/java/edu/harvard/iq/dataverse/api/DataversesIT.java @@ -22,6 +22,7 @@ import java.text.MessageFormat; import java.util.Arrays; import java.util.List; +import java.util.Map; import java.util.logging.Logger; import jakarta.json.Json; @@ -30,8 +31,12 @@ import jakarta.ws.rs.core.Response.Status; import org.junit.jupiter.api.AfterAll; +import org.junit.jupiter.api.AfterEach; import org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.api.Test; +import org.junit.jupiter.api.parallel.Isolated; +import org.junit.jupiter.api.parallel.ResourceAccessMode; +import org.junit.jupiter.api.parallel.ResourceLock; import static jakarta.ws.rs.core.Response.Status.*; import static org.hamcrest.CoreMatchers.*; @@ -49,6 +54,8 @@ import org.hamcrest.Matchers; import static org.hamcrest.Matchers.greaterThan; +@ResourceLock(value = "MetadataLanguages", mode = ResourceAccessMode.READ_WRITE) +@Isolated public class DataversesIT { private static final Logger logger = Logger.getLogger(DataversesIT.class.getCanonicalName()); @@ -56,11 +63,18 @@ public class DataversesIT { @BeforeAll public static void setUpClass() { RestAssured.baseURI = UtilIT.getRestAssuredBaseUri(); + UtilIT.deleteSetting(SettingsServiceBean.Key.MetadataLanguages); } @AfterAll public static void afterClass() { Response removeExcludeEmail = UtilIT.deleteSetting(SettingsServiceBean.Key.ExcludeEmailFromExport); + UtilIT.deleteSetting(SettingsServiceBean.Key.MetadataLanguages); + } + + @AfterEach + public void afterEach() { + UtilIT.deleteSetting(SettingsServiceBean.Key.MetadataLanguages); } @Test @@ -2745,4 +2759,37 @@ private String getSuperuserToken() { UtilIT.makeSuperUser(username); return adminApiToken; } + + @Test + public void testDataverseMetadataLanguage() { + Response createUser = UtilIT.createRandomUser(); + 
createUser.prettyPrint(); + String apiToken = UtilIT.getApiTokenFromResponse(createUser); + Response createDataverse1Response = UtilIT.createRandomDataverse(apiToken); + + createDataverse1Response.prettyPrint(); + createDataverse1Response.then().assertThat().statusCode(CREATED.getStatusCode()); + + String alias = UtilIT.getAliasFromResponse(createDataverse1Response); + + Response noLang = UtilIT.getDataverseMetadataLanguage(alias, apiToken); + noLang.prettyPrint(); + + noLang.then().assertThat().body("data", equalTo(List.of())); + + UtilIT.setSetting(SettingsServiceBean.Key.MetadataLanguages, + "[{\"locale\":\"en\",\"title\":\"English\"},{\"locale\":\"hu\",\"title\":\"magyar\"}]"); + Response allLangs = UtilIT.getDataverseMetadataLanguage(alias, apiToken); + allLangs.prettyPrint(); + allLangs.then().assertThat() + .body("data.size()", equalTo(2)) + .and().body("data[0].locale", equalTo("en")) + .and().body("data[1].locale", equalTo("hu")); + + Response english = UtilIT.setDataverseMetadataLanguage(alias, apiToken, "en"); + english.then().assertThat().body("data", equalTo(List.of(Map.of("locale", "en", "title", "English")))); + Response singleLang = UtilIT.getDataverseMetadataLanguage(alias, apiToken); + singleLang.then().assertThat().body("data", equalTo(List.of(Map.of("locale", "en", "title", "English")))); + } + } diff --git a/src/test/java/edu/harvard/iq/dataverse/api/FilesIT.java b/src/test/java/edu/harvard/iq/dataverse/api/FilesIT.java index 175d93b57a6..83eb80104f3 100644 --- a/src/test/java/edu/harvard/iq/dataverse/api/FilesIT.java +++ b/src/test/java/edu/harvard/iq/dataverse/api/FilesIT.java @@ -60,11 +60,14 @@ public static void setUpClass() { Response removePublicInstall = UtilIT.deleteSetting(SettingsServiceBean.Key.PublicInstall); removePublicInstall.then().assertThat().statusCode(200); + Response removeLimit = UtilIT.deleteSetting(SettingsServiceBean.Key.TabularIngestSizeLimit); + removeLimit.then().assertThat().statusCode(OK.getStatusCode()); } 
@AfterAll public static void tearDownClass() { UtilIT.deleteSetting(SettingsServiceBean.Key.PublicInstall); + UtilIT.deleteSetting(SettingsServiceBean.Key.TabularIngestSizeLimit); } /** @@ -1208,7 +1211,183 @@ public void test_AddFileBadUploadFormat() { } } - + + @Test + public void testIngestSizeLimits() throws InterruptedException, IOException { + Response createUser = UtilIT.createRandomUser(); + createUser.then().assertThat().statusCode(OK.getStatusCode()); + String username = UtilIT.getUsernameFromResponse(createUser); + String apiToken = UtilIT.getApiTokenFromResponse(createUser); + Response makeSuperUser = UtilIT.setSuperuserStatus(username, true); + makeSuperUser.then().assertThat().statusCode(OK.getStatusCode()); + + Response createDataverseResponse = UtilIT.createRandomDataverse(apiToken); + createDataverseResponse.prettyPrint(); + String dataverseAlias = UtilIT.getAliasFromResponse(createDataverseResponse); + Response createDatasetResponse = UtilIT.createRandomDatasetViaNativeApi(dataverseAlias, apiToken); + createDatasetResponse.prettyPrint(); + Integer datasetId = JsonPath.from(createDatasetResponse.body().asString()).getInt("data.id"); + + String tinyCsvOnly = """ +{ + "csv": "50" +} +"""; + + Response setLimit = UtilIT.setSetting(SettingsServiceBean.Key.TabularIngestSizeLimit, tinyCsvOnly); + setLimit.then().assertThat().statusCode(OK.getStatusCode()); + + Path pathToDataFile = Paths.get(java.nio.file.Files.createTempDirectory(null) + File.separator + "data.csv"); + String contentOfCsv = "" + + "name,pounds,species,treats\n" + + "Midnight,15,dog,milkbones\n" + + "Tiger,17,cat,cat grass\n" + + "Panther,21,cat,cat nip\n"; + java.nio.file.Files.write(pathToDataFile, contentOfCsv.getBytes()); + + Response uploadFile = UtilIT.uploadFileViaNative(datasetId.toString(), pathToDataFile.toString(), apiToken); + uploadFile.prettyPrint(); + uploadFile.then().assertThat() + .statusCode(OK.getStatusCode()) + .body("data.files[0].label", equalTo("data.csv")); + + 
String fileId1 = JsonPath.from(uploadFile.body().asString()).getString("data.files[0].dataFile.id"); + + Response getTabularFails = UtilIT.getFileDataTables(fileId1, apiToken); + getTabularFails.prettyPrint(); + getTabularFails.then().assertThat() + .statusCode(BAD_REQUEST.getStatusCode()) + .body("message", equalTo(BundleUtil.getStringFromBundle("files.api.only.tabular.supported"))); + + String largerCsv = """ +{ + "csv": "123456" +} +"""; + + setLimit = UtilIT.setSetting(SettingsServiceBean.Key.TabularIngestSizeLimit, largerCsv); + setLimit.then().assertThat().statusCode(OK.getStatusCode()); + + uploadFile = UtilIT.uploadFileViaNative(datasetId.toString(), pathToDataFile.toString(), apiToken); + uploadFile.prettyPrint(); + uploadFile.then().assertThat() + .statusCode(OK.getStatusCode()) + .body("data.files[0].label", equalTo("data-1.csv")); + + assertTrue(UtilIT.sleepForLock(datasetId.longValue(), "Ingest", apiToken, UtilIT.MAXIMUM_INGEST_LOCK_DURATION), "Failed test if Ingest Lock exceeds max duration " + pathToDataFile); + + String fileId2 = JsonPath.from(uploadFile.body().asString()).getString("data.files[0].dataFile.id"); + + Response getTabularWorks = UtilIT.getFileDataTables(fileId2, apiToken); + getTabularWorks.prettyPrint(); + getTabularWorks.then().assertThat() + .statusCode(OK.getStatusCode()) + .body("data[0].varQuantity", equalTo(4)); + + String tinyDefaultSize = """ +{ + "default": "50" +} +"""; + + setLimit = UtilIT.setSetting(SettingsServiceBean.Key.TabularIngestSizeLimit, tinyDefaultSize); + setLimit.then().assertThat().statusCode(OK.getStatusCode()); + + uploadFile = UtilIT.uploadFileViaNative(datasetId.toString(), pathToDataFile.toString(), apiToken); + uploadFile.prettyPrint(); + uploadFile.then().assertThat() + .statusCode(OK.getStatusCode()) + .body("data.files[0].label", equalTo("data-2.csv")); + + String fileId3 = JsonPath.from(uploadFile.body().asString()).getString("data.files[0].dataFile.id"); + + getTabularFails = 
UtilIT.getFileDataTables(fileId3, apiToken); + getTabularFails.prettyPrint(); + getTabularFails.then().assertThat() + .statusCode(BAD_REQUEST.getStatusCode()) + .body("message", equalTo(BundleUtil.getStringFromBundle("files.api.only.tabular.supported"))); + + // The behavior of `"default": "-2"` is not documented in the guides + // but it acts like `"default": "0"` which disables ingest. + String unexpectedNegativeDefault = """ +{ + "default": "-2" +} +"""; + + setLimit = UtilIT.setSetting(SettingsServiceBean.Key.TabularIngestSizeLimit, unexpectedNegativeDefault); + setLimit.then().assertThat().statusCode(OK.getStatusCode()); + + uploadFile = UtilIT.uploadFileViaNative(datasetId.toString(), pathToDataFile.toString(), apiToken); + uploadFile.prettyPrint(); + uploadFile.then().assertThat() + .statusCode(OK.getStatusCode()) + .body("data.files[0].label", equalTo("data-3.csv")); + + String fileId4 = JsonPath.from(uploadFile.body().asString()).getString("data.files[0].dataFile.id"); + + getTabularFails = UtilIT.getFileDataTables(fileId4, apiToken); + getTabularFails.prettyPrint(); + getTabularFails.then().assertThat() + .statusCode(BAD_REQUEST.getStatusCode()) + .body("message", equalTo(BundleUtil.getStringFromBundle("files.api.only.tabular.supported"))); + + // As the guides say, you MUST provide a string, not a JSON number. + // That is, `"123"` in quotes rather than `123` with no quotes. + // If you provide a number (no quotes) rather than a string, + // all ingest will be disabled and you'll see an error in server.log + // about how the system is misconfigured. 
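The cases exercised above suggest a simple resolution rule for `:TabularIngestSizeLimit`: a format-specific key (e.g. `csv`) wins over `default`, a limit of `0` (or a negative value such as `-2`) disables ingest, and an absent setting means no limit. A hedged sketch of that rule, using hypothetical names (`IngestLimitSketch`, `resolveLimit`) rather than the actual Dataverse implementation:

```java
import java.util.Map;

// Hypothetical sketch (not the real Dataverse code) of the limit lookup the
// test above exercises: a format-specific key wins, otherwise "default"
// applies; 0 or any negative value disables ingest; no setting means no limit.
public class IngestLimitSketch {

    static long resolveLimit(Map<String, String> setting, String format) {
        if (setting == null) {
            return Long.MAX_VALUE; // setting absent: no limit
        }
        String raw = setting.getOrDefault(format, setting.get("default"));
        if (raw == null) {
            return Long.MAX_VALUE; // neither format key nor "default" present
        }
        long limit = Long.parseLong(raw);
        return limit <= 0 ? 0 : limit; // 0 and negatives both mean "ingest disabled"
    }

    static boolean ingestAllowed(Map<String, String> setting, String format, long fileSize) {
        return fileSize <= resolveLimit(setting, format);
    }

    public static void main(String[] args) {
        // a ~100 byte csv exceeds the 50 byte csv-specific limit
        System.out.println(ingestAllowed(Map.of("csv", "50"), "csv", 100));
        // "default": "-2" behaves like "0": ingest disabled
        System.out.println(ingestAllowed(Map.of("default", "-2"), "csv", 1));
        // csv-specific limit overrides a disabled default
        System.out.println(ingestAllowed(Map.of("default", "0", "csv", "123456"), "csv", 100));
    }
}
```

Note the values are JSON *strings*; as the test comment explains, a bare JSON number in the setting is rejected by the real parser and disables all ingest.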
+ String invalidNonString = """ +{ + "default": 987654321 +} +"""; + + setLimit = UtilIT.setSetting(SettingsServiceBean.Key.TabularIngestSizeLimit, invalidNonString); + setLimit.then().assertThat().statusCode(OK.getStatusCode()); + + uploadFile = UtilIT.uploadFileViaNative(datasetId.toString(), pathToDataFile.toString(), apiToken); + uploadFile.prettyPrint(); + uploadFile.then().assertThat() + .statusCode(OK.getStatusCode()) + .body("data.files[0].label", equalTo("data-4.csv")); + + String fileId5 = JsonPath.from(uploadFile.body().asString()).getString("data.files[0].dataFile.id"); + + getTabularFails = UtilIT.getFileDataTables(fileId5, apiToken); + getTabularFails.prettyPrint(); + getTabularFails.then().assertThat() + .statusCode(BAD_REQUEST.getStatusCode()) + .body("message", equalTo(BundleUtil.getStringFromBundle("files.api.only.tabular.supported"))); + + String defaultDisabledAndLargeCsvLimit = """ +{ + "default": "0", + "csv": "123456" +} +"""; + + setLimit = UtilIT.setSetting(SettingsServiceBean.Key.TabularIngestSizeLimit, defaultDisabledAndLargeCsvLimit); + setLimit.then().assertThat().statusCode(OK.getStatusCode()); + + uploadFile = UtilIT.uploadFileViaNative(datasetId.toString(), pathToDataFile.toString(), apiToken); + uploadFile.prettyPrint(); + uploadFile.then().assertThat() + .statusCode(OK.getStatusCode()) + .body("data.files[0].label", equalTo("data-5.csv")); + + String fileId6 = JsonPath.from(uploadFile.body().asString()).getString("data.files[0].dataFile.id"); + + assertTrue(UtilIT.sleepForLock(datasetId.longValue(), "Ingest", apiToken, UtilIT.MAXIMUM_INGEST_LOCK_DURATION), "Failed test if Ingest Lock exceeds max duration " + pathToDataFile); + + getTabularWorks = UtilIT.getFileDataTables(fileId6, apiToken); + getTabularWorks.prettyPrint(); + getTabularWorks.then().assertThat() + .statusCode(OK.getStatusCode()) + .body("data[0].varQuantity", equalTo(4)); + + Response removeLimit = UtilIT.deleteSetting(SettingsServiceBean.Key.TabularIngestSizeLimit); + removeLimit.then().assertThat().statusCode(OK.getStatusCode()); + } + @Test + public void testUningestFileViaApi() throws InterruptedException { + Response createUser = 
UtilIT.createRandomUser(); diff --git a/src/test/java/edu/harvard/iq/dataverse/api/UtilIT.java b/src/test/java/edu/harvard/iq/dataverse/api/UtilIT.java index 5a07769b313..496ae826e02 100644 --- a/src/test/java/edu/harvard/iq/dataverse/api/UtilIT.java +++ b/src/test/java/edu/harvard/iq/dataverse/api/UtilIT.java @@ -2526,10 +2526,20 @@ static Response deleteSetting(String settingKey) { return response; } + static Response getSettings() { + Response response = given().when().get("/api/admin/settings"); + return response; + } + static Response getSetting(SettingsServiceBean.Key settingKey) { Response response = given().when().get("/api/admin/settings/" + settingKey); return response; } + + static Response getSetting(SettingsServiceBean.Key settingKey, String language) { + Response response = given().when().get("/api/admin/settings/" + settingKey + "/lang/" + language); + return response; + } static Response setSetting(SettingsServiceBean.Key settingKey, String value) { Response response = given().body(value).when().put("/api/admin/settings/" + settingKey); @@ -2549,6 +2559,15 @@ public static Response setSetting(String settingKey, String value) { return response; } + public static Response setSettings(String value) { + Response response = given() + .header("Content-Type", "application/json") + .body(value) + .when() + .put("/api/admin/settings"); + return response; + } + static Response getFeatureFlags() { Response response = given().when().get("/api/admin/featureFlags"); return response; @@ -4700,6 +4719,25 @@ static Response updateDatasetTypeAvailableLicense(String idOrName, String jsonAr .put("/api/datasets/datasetTypes/" + idOrName + "/licenses"); } + public static Response getWorkflowIpWhitelist() { + Response response = given() + .get("/api/admin/workflows/ip-whitelist"); + return response; + } + + public static Response setWorkflowIpWhitelist(String iPWhitelist) { + Response response = given() + .body(iPWhitelist) + .put("/api/admin/workflows/ip-whitelist"); + 
return response; + } + + public static Response deleteWorkflowIpWhitelist() { + Response response = given() + .delete("/api/admin/workflows/ip-whitelist"); + return response; + } + static Response registerOidcUser(String jsonIn, String bearerToken) { return given() .header(HttpHeaders.AUTHORIZATION, bearerToken) @@ -5113,6 +5151,22 @@ public static Response callCallbackUrl(String callbackUrl) { .get(callbackUrl); } + public static Response getDataverseMetadataLanguage(String alias, String apiToken) { + return given() + .header(API_TOKEN_HTTP_HEADER, apiToken) + .get("/api/dataverses/" + + alias + + "/allowedMetadataLanguages"); + } + + public static Response setDataverseMetadataLanguage(String alias, String apiToken, String lang) { + return given() + .header(API_TOKEN_HTTP_HEADER, apiToken) + .put("/api/dataverses/" + + alias + + "/allowedMetadataLanguages/" + + lang); + } public static Response getDataverseRoleAssignmentHistory(String dataverseAlias, boolean downloadAsCsv, String apiToken) { RequestSpecification requestSpecification = given() diff --git a/src/test/java/edu/harvard/iq/dataverse/api/WorkflowsIT.java b/src/test/java/edu/harvard/iq/dataverse/api/WorkflowsIT.java new file mode 100644 index 00000000000..4b94fe6ee68 --- /dev/null +++ b/src/test/java/edu/harvard/iq/dataverse/api/WorkflowsIT.java @@ -0,0 +1,64 @@ +package edu.harvard.iq.dataverse.api; + +import edu.harvard.iq.dataverse.settings.SettingsServiceBean; +import io.restassured.RestAssured; +import static io.restassured.RestAssured.given; +import io.restassured.response.Response; +import static jakarta.ws.rs.core.Response.Status.BAD_REQUEST; +import static jakarta.ws.rs.core.Response.Status.INTERNAL_SERVER_ERROR; +import static jakarta.ws.rs.core.Response.Status.OK; +import static org.hamcrest.CoreMatchers.equalTo; +import org.junit.jupiter.api.AfterAll; +import org.junit.jupiter.api.BeforeAll; +import org.junit.jupiter.api.Test; + +public class WorkflowsIT { + + @BeforeAll + public static void 
setUpClass() { + RestAssured.baseURI = UtilIT.getRestAssuredBaseUri(); + + UtilIT.deleteWorkflowIpWhitelist(); + } + + @AfterAll + public static void afterClass() { + } + + @Test + public void testIpWhitelist() { + Response response = null; + + response = UtilIT.getWorkflowIpWhitelist(); + response.prettyPrint(); + response.then().assertThat() + .statusCode(OK.getStatusCode()) + .body("data.message", equalTo("127.0.0.1;::1")); + + String testIp = "192.168.0.1;192.168.0.2"; + + response = UtilIT.setWorkflowIpWhitelist("junk"); + response.prettyPrint(); + response.then().assertThat() + .statusCode(BAD_REQUEST.getStatusCode()) + .body("message", equalTo("Request contains illegal IP addresses.")); + + response = UtilIT.setWorkflowIpWhitelist(testIp); + response.prettyPrint(); + response.then().assertThat() + .statusCode(OK.getStatusCode()); + + response = UtilIT.getWorkflowIpWhitelist(); + response.prettyPrint(); + response.then().assertThat() + .statusCode(OK.getStatusCode()) + .body("data.message", equalTo(testIp)); + + response = given().when().get("/api/admin/settings/" + SettingsServiceBean.Key.WorkflowsAdminIpWhitelist); + response.prettyPrint(); + response.then().assertThat() + .statusCode(OK.getStatusCode()) + .body("data.message", equalTo(testIp)); + } + +} diff --git a/src/test/java/edu/harvard/iq/dataverse/export/ddi/DdiExportUtilTest.java b/src/test/java/edu/harvard/iq/dataverse/export/ddi/DdiExportUtilTest.java index 360e9dfbafe..03000d55b5a 100644 --- a/src/test/java/edu/harvard/iq/dataverse/export/ddi/DdiExportUtilTest.java +++ b/src/test/java/edu/harvard/iq/dataverse/export/ddi/DdiExportUtilTest.java @@ -19,8 +19,6 @@ import java.nio.charset.StandardCharsets; import java.nio.file.Files; import java.nio.file.Path; -import java.nio.file.Paths; -import java.util.Arrays; import java.util.HashMap; import java.util.List; import java.util.Map; @@ -93,7 +91,7 @@ public static void setUpClass() throws Exception { PidUtil.clearPidProviders(); //Read list of 
providers to add - List<String> providers = Arrays.asList(JvmSettings.PID_PROVIDERS.lookup().split(",\\s")); + List<String> providers = JvmSettings.PID_PROVIDERS.lookupSplittedList(); //Iterate through the list of providers and add them using the PidProviderFactory of the appropriate type for (String providerId : providers) { System.out.println("Loading provider: " + providerId); diff --git a/src/test/java/edu/harvard/iq/dataverse/filter/CorsFilterTest.java b/src/test/java/edu/harvard/iq/dataverse/filter/CorsFilterTest.java new file mode 100644 index 00000000000..8db5d43e14d --- /dev/null +++ b/src/test/java/edu/harvard/iq/dataverse/filter/CorsFilterTest.java @@ -0,0 +1,224 @@ +package edu.harvard.iq.dataverse.filter; + +import jakarta.servlet.FilterChain; +import jakarta.servlet.ServletRequest; +import jakarta.servlet.ServletResponse; +import jakarta.servlet.http.HttpServletRequest; +import jakarta.servlet.http.HttpServletResponse; +import org.junit.jupiter.api.AfterEach; +import org.junit.jupiter.api.BeforeEach; +import org.junit.jupiter.api.Test; +import org.mockito.ArgumentCaptor; + +import java.util.HashMap; +import java.util.Map; + +import static org.junit.jupiter.api.Assertions.*; +import static org.mockito.ArgumentMatchers.any; +import static org.mockito.ArgumentMatchers.anyString; +import static org.mockito.ArgumentMatchers.argThat; +import static org.mockito.ArgumentMatchers.contains; +import static org.mockito.ArgumentMatchers.eq; +import static org.mockito.Mockito.*; + +class CorsFilterTest { + + private final Map<String, String> sysPropsBackup = new HashMap<>(); + + @BeforeEach + void setUp() { + // backup potentially touched props + backupAndClear("dataverse.cors.origin"); + backupAndClear("dataverse.cors.methods"); + backupAndClear("dataverse.cors.headers.allow"); + backupAndClear("dataverse.cors.headers.expose"); + } + + @AfterEach + void tearDown() { + restore("dataverse.cors.origin"); + restore("dataverse.cors.methods"); + restore("dataverse.cors.headers.allow"); + 
restore("dataverse.cors.headers.expose"); + } + + @Test + void wildcardOrigin_allowsAny_noVary() throws Exception { + System.setProperty("dataverse.cors.origin", "*"); + + CorsFilter sut = new CorsFilter(); + sut.init(null); + + HttpServletRequest req = mock(HttpServletRequest.class); + when(req.getHeader("Origin")).thenReturn("https://a.example"); + HttpServletResponse res = mock(HttpServletResponse.class); + FilterChain chain = mock(FilterChain.class); + + sut.doFilter(req, res, chain); + + verify(res).setHeader("Access-Control-Allow-Origin", "*"); + // By design, Vary not required for wildcard + verify(res, never()).setHeader(eq("Vary"), anyString()); + verify(chain).doFilter(any(ServletRequest.class), any(ServletResponse.class)); + } + + @Test + void singleOrigin_echoesAndAddsVary() throws Exception { + System.setProperty("dataverse.cors.origin", "https://libis.github.io"); + + CorsFilter sut = new CorsFilter(); + sut.init(null); + + HttpServletRequest req = mock(HttpServletRequest.class); + when(req.getHeader("Origin")).thenReturn("https://libis.github.io"); + HttpServletResponse res = mock(HttpServletResponse.class); + when(res.getHeader("Vary")).thenReturn(null); + FilterChain chain = mock(FilterChain.class); + + sut.doFilter(req, res, chain); + + verify(res).setHeader("Access-Control-Allow-Origin", "https://libis.github.io"); + + ArgumentCaptor<String> varyVal = ArgumentCaptor.forClass(String.class); + verify(res).setHeader(eq("Vary"), varyVal.capture()); + assertTrue(varyVal.getValue().contains("Origin")); + verify(chain).doFilter(any(ServletRequest.class), any(ServletResponse.class)); + } + + @Test + void multipleOrigins_echoesMatch_onlyWhenAllowed() throws Exception { + // Comma-separated list as set via JVM options/Microprofile + System.setProperty("dataverse.cors.origin", "https://a.example, https://b.example"); + + CorsFilter sut = new CorsFilter(); + sut.init(null); + + // allowed origin + HttpServletRequest reqAllowed = mock(HttpServletRequest.class); + 
when(reqAllowed.getHeader("Origin")).thenReturn("https://b.example"); + HttpServletResponse resAllowed = mock(HttpServletResponse.class); + FilterChain chain = mock(FilterChain.class); + + sut.doFilter(reqAllowed, resAllowed, chain); + verify(resAllowed).setHeader("Access-Control-Allow-Origin", "https://b.example"); + verify(resAllowed).setHeader(eq("Vary"), contains("Origin")); + + // not allowed origin -> no ACAO header set + HttpServletRequest reqDenied = mock(HttpServletRequest.class); + when(reqDenied.getHeader("Origin")).thenReturn("https://c.example"); + HttpServletResponse resDenied = mock(HttpServletResponse.class); + + sut.doFilter(reqDenied, resDenied, chain); + verify(resDenied, never()).setHeader(eq("Access-Control-Allow-Origin"), anyString()); + } + + @Test + void whitespaceAndMixedCasingParsing() throws Exception { + System.setProperty("dataverse.cors.origin", + " https://one.example ,\n\t https://two.example , https://three.example "); + + CorsFilter sut = new CorsFilter(); + sut.init(null); + + HttpServletRequest req = mock(HttpServletRequest.class); + when(req.getHeader("Origin")).thenReturn("https://two.example"); + HttpServletResponse res = mock(HttpServletResponse.class); + when(res.getHeader("Vary")).thenReturn("Accept-Encoding"); + + sut.doFilter(req, res, mock(FilterChain.class)); + + verify(res).setHeader("Access-Control-Allow-Origin", "https://two.example"); + // ensure existing Vary preserved and Origin added + verify(res).setHeader(eq("Vary"), argThat(v -> v.contains("Origin") && v.contains("Accept-Encoding"))); + } + + @Test + void wildcardAmongOthersTreatsAsWildcard() throws Exception { + System.setProperty("dataverse.cors.origin", "https://a.example,*,https://b.example"); + + CorsFilter sut = new CorsFilter(); + sut.init(null); + + HttpServletRequest req = mock(HttpServletRequest.class); + when(req.getHeader("Origin")).thenReturn("https://random.example"); + HttpServletResponse res = mock(HttpServletResponse.class); + + 
sut.doFilter(req, res, mock(FilterChain.class)); + + verify(res).setHeader("Access-Control-Allow-Origin", "*"); + verify(res, never()).setHeader(eq("Vary"), anyString()); + } + + @Test + void existingVaryMergedWithoutDuplication() throws Exception { + System.setProperty("dataverse.cors.origin", "https://merge.example"); + + CorsFilter sut = new CorsFilter(); + sut.init(null); + + HttpServletRequest req = mock(HttpServletRequest.class); + when(req.getHeader("Origin")).thenReturn("https://merge.example"); + HttpServletResponse res = mock(HttpServletResponse.class); + when(res.getHeader("Vary")).thenReturn("Accept-Encoding, Origin"); + + sut.doFilter(req, res, mock(FilterChain.class)); + + // Origin should not be duplicated + verify(res).setHeader(eq("Vary"), argThat(v -> v.indexOf("Origin") == v.lastIndexOf("Origin"))); + } + + @Test + void quotedHeaderListsPreserved() throws Exception { + System.setProperty("dataverse.cors.origin", "https://x.example"); + System.setProperty("dataverse.cors.headers.allow", "\"Accept, X-Dataverse-key\""); + System.setProperty("dataverse.cors.headers.expose", "\"Accept-Ranges, Content-Range\""); + System.setProperty("dataverse.cors.methods", "GET, POST, OPTIONS"); + + CorsFilter sut = new CorsFilter(); + sut.init(null); + + HttpServletRequest req = mock(HttpServletRequest.class); + when(req.getHeader("Origin")).thenReturn("https://x.example"); + HttpServletResponse res = mock(HttpServletResponse.class); + + sut.doFilter(req, res, mock(FilterChain.class)); + + // With simplified CsvUtil we now preserve surrounding quotes provided by admin config. 
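The whitespace test above relies on the new comma-list parsing rule described in the release notes: spaces, tabs, and newlines around commas are ignored, while the tokens themselves pass through unchanged (no quote stripping). A minimal sketch of that rule, using a hypothetical helper (`CsvSplitSketch.split`) rather than the actual `CsvUtil` code:

```java
import java.util.Arrays;
import java.util.List;

// Hedged sketch of the comma-list parsing the tests above rely on: whitespace
// around commas is ignored, tokens are passed through unchanged. The class and
// method names are illustrative, not the actual Dataverse API.
public class CsvSplitSketch {

    static List<String> split(String raw) {
        if (raw == null || raw.isBlank()) {
            return List.of();
        }
        // trim each token rather than matching separators with a
        // whitespace-aware regex; empty tokens are dropped
        return Arrays.stream(raw.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .toList();
    }

    public static void main(String[] args) {
        // newlines and tabs around commas are ignored
        System.out.println(split(" https://one.example ,\n\t https://two.example "));
        // wildcard detection can then be a simple contains("*") over the tokens
        System.out.println(split("https://a.example,*,https://b.example").contains("*"));
    }
}
```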
+ verify(res).setHeader("Access-Control-Allow-Headers", "\"Accept, X-Dataverse-key\""); + verify(res).setHeader("Access-Control-Expose-Headers", "\"Accept-Ranges, Content-Range\""); + verify(res).setHeader("Access-Control-Allow-Methods", "GET, POST, OPTIONS"); + } + + @Test + void disabledCors_skipsHeaders() throws Exception { + // no origin set -> CORS disabled + CorsFilter sut = new CorsFilter(); + sut.init(null); + + HttpServletRequest req = mock(HttpServletRequest.class); + when(req.getHeader("Origin")).thenReturn("https://any.example"); + HttpServletResponse res = mock(HttpServletResponse.class); + + sut.doFilter(req, res, mock(FilterChain.class)); + + verify(res, never()).setHeader(eq("Access-Control-Allow-Origin"), anyString()); + verify(res, never()).setHeader(eq("Access-Control-Allow-Methods"), anyString()); + verify(res, never()).setHeader(eq("Access-Control-Allow-Headers"), anyString()); + verify(res, never()).setHeader(eq("Access-Control-Expose-Headers"), anyString()); + } + + private void backupAndClear(String key) { + String old = System.getProperty(key); + if (old != null) { + sysPropsBackup.put(key, old); + } + System.clearProperty(key); + } + + private void restore(String key) { + System.clearProperty(key); + if (sysPropsBackup.containsKey(key)) { + System.setProperty(key, sysPropsBackup.get(key)); + } + } +} diff --git a/src/test/java/edu/harvard/iq/dataverse/pidproviders/PidUtilTest.java b/src/test/java/edu/harvard/iq/dataverse/pidproviders/PidUtilTest.java index 3f8c198b0fe..201d3c6c25d 100644 --- a/src/test/java/edu/harvard/iq/dataverse/pidproviders/PidUtilTest.java +++ b/src/test/java/edu/harvard/iq/dataverse/pidproviders/PidUtilTest.java @@ -153,7 +153,7 @@ public static void setUpClass() throws Exception { PidUtil.clearPidProviders(); //Read list of providers to add - List<String> providers = Arrays.asList(JvmSettings.PID_PROVIDERS.lookup().split(",\\s")); + List<String> providers = JvmSettings.PID_PROVIDERS.lookupSplittedList(); //Iterate through the list 
of providers and add them using the PidProviderFactory of the appropriate type for (String providerId : providers) { System.out.println("Loading provider: " + providerId); diff --git a/src/test/java/edu/harvard/iq/dataverse/settings/SettingsServiceBeanTest.java b/src/test/java/edu/harvard/iq/dataverse/settings/SettingsServiceBeanTest.java new file mode 100644 index 00000000000..c4881257374 --- /dev/null +++ b/src/test/java/edu/harvard/iq/dataverse/settings/SettingsServiceBeanTest.java @@ -0,0 +1,433 @@ +package edu.harvard.iq.dataverse.settings; + +import jakarta.json.Json; +import jakarta.json.JsonArray; +import jakarta.json.JsonObject; +import jakarta.persistence.EntityManager; +import jakarta.persistence.TypedQuery; +import org.junit.jupiter.api.AfterEach; +import org.junit.jupiter.api.BeforeAll; +import org.junit.jupiter.api.Nested; +import org.junit.jupiter.api.Test; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.Arguments; +import org.junit.jupiter.params.provider.CsvSource; +import org.junit.jupiter.params.provider.MethodSource; +import org.junit.jupiter.params.provider.ValueSource; +import org.mockito.ArgumentMatchers; + +import java.util.Collections; +import java.util.List; +import java.util.Map; +import java.util.Set; + +import static org.junit.jupiter.api.Assertions.assertDoesNotThrow; +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.mockito.ArgumentMatchers.any; +import static org.mockito.Mockito.clearInvocations; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.never; +import static org.mockito.Mockito.verify; +import static org.mockito.Mockito.when; + +class SettingsServiceBeanTest { + + @Nested + class KeyEnumTest { + static List<Arguments> parseTestParameters() { + return List.of( + Arguments.of(null, null), + Arguments.of("", null), + Arguments.of(" ", null), + Arguments.of("foobar", null), + 
Arguments.of("ShowMuteOptions", null), + Arguments.of(":FooBar", null), + Arguments.of(":ShowMuteOptions", SettingsServiceBean.Key.ShowMuteOptions) + ); + } + + @MethodSource("parseTestParameters") + @ParameterizedTest + void testParse(String sut, SettingsServiceBean.Key expected) { + assertEquals(expected, SettingsServiceBean.Key.parse(sut)); + } + + @Test + void testToString() { + // Make sure we test the intended behavior so it doesn't change by accident. + assertEquals(":ShowMuteOptions", SettingsServiceBean.Key.ShowMuteOptions.toString()); + } + + @Test + void testRoundtrip() { + for (SettingsServiceBean.Key key : SettingsServiceBean.Key.values()) { + assertEquals(key, SettingsServiceBean.Key.parse(key.toString())); + } + } + } + + @Nested + class ValidateSettingNameTest { + + @ValueSource(strings = {":ShowMuteOptions", ":AllowApiTokenLookupViaApi", ":OAuth2CallbackUrl"}) + @ParameterizedTest + void testValidateSettingName_validNames(String name) { + assertDoesNotThrow(() -> SettingsServiceBean.validateSettingName(name)); + } + + @CsvSource({ + "invalidName, 'The name of the setting is invalid.'", + ":invalid:suffix, 'The name of the setting may not have a colon separated suffix since Dataverse 6.8. 
Please update your scripts.'", + ":NonExistentKey, 'The name of the setting is invalid.'", + ":ShowMuteOptions/lang/en, 'The name of the setting is invalid.'" + }) + @ParameterizedTest + void testValidateSettingName_invalidNames(String name, String expectedMessage) { + SettingsValidationException exception = assertThrows(SettingsValidationException.class, + () -> SettingsServiceBean.validateSettingName(name)); + assertEquals(expectedMessage, exception.getMessage()); + } + } + + @Nested + class ValidateSettingLangTest { + + @ValueSource(strings = {"en", "fr", "de"}) + @ParameterizedTest + void testValidateSettingLang_validLanguage(String language) { + assertDoesNotThrow(() -> SettingsServiceBean.validateSettingLang(language)); + } + + @CsvSource({ + ", 'The language ''null'' is not a valid ISO 639-1 language code.'", + "e, 'The language ''e'' is not a valid ISO 639-1 language code.'", + "xyz, 'The language ''xyz'' is not a valid ISO 639-1 language code.'", + "zz, 'The language ''zz'' is not a valid ISO 639-1 language code.'" + }) + @ParameterizedTest + void testValidateSettingLang_invalidLanguage(String language, String expectedMessage) { + SettingsValidationException exception = assertThrows(SettingsValidationException.class, + () -> SettingsServiceBean.validateSettingLang(language)); + assertEquals(expectedMessage, exception.getMessage()); + } + } + + @Nested + class ValidateKeysTest { + static List<Arguments> validateKeysTestParameters() { + return List.of( + Arguments.of( + Json.createObjectBuilder() + .add(":ApplicationTermsOfUse", "validValue1") + .add(":ApplicationTermsOfUse/lang/en", "validValue2") + .build(), + List.of() + ), + Arguments.of( + Json.createObjectBuilder() + .add(":Invalid:Key", "value1") + .add(":NonExistentKey/lang/fr", "value2") + .build(), + List.of(":Invalid:Key", ":NonExistentKey/lang/fr") + ), + Arguments.of( + Json.createObjectBuilder() + .add(":ApplicationTermsOfUse", "value3") + .add("NoColonKey", "value4") + .build(), + List.of("NoColonKey") + 
) + ); + } + + @MethodSource("validateKeysTestParameters") + @ParameterizedTest + void testValidateKeys(JsonObject input, List<String> expectedInvalidKeys) { + List<String> result = SettingsServiceBean.validateKeys(input); + assertEquals(expectedInvalidKeys, result); + } + } + + @Nested + class ListAllAsJsonTest { + + static TypedQuery<Setting> typedQuery = mock(TypedQuery.class); + static EntityManager em = mock(EntityManager.class); + static SettingsServiceBean settingsServiceBean = new SettingsServiceBean(); + + @BeforeAll + static void setup() { + settingsServiceBean.em = em; + + when(em.createNamedQuery( + ArgumentMatchers.eq("Setting.findAll"), + ArgumentMatchers.eq(Setting.class))) + .thenReturn(typedQuery); + } + + @Test + void testListAllAsJson_noSettings() { + // Given + List<Setting> emptyList = Collections.emptyList(); + when(typedQuery.getResultList()).thenReturn(emptyList); + + // When + JsonObject result = settingsServiceBean.listAllAsJson(); + + // Then + assertEquals(0, result.size()); + } + + @Test + void testListAllAsJson_nonLocalizedSettings() { + // Given + List<Setting> resultList = List.of( + new Setting("testKey1", "testValue1"), + new Setting("testKey2", "12345") + ); + when(typedQuery.getResultList()).thenReturn(resultList); + + // When + JsonObject result = settingsServiceBean.listAllAsJson(); + + // Then + assertEquals(2, result.size()); + assertEquals("testValue1", result.getString("testKey1")); + assertEquals("12345", result.getString("testKey2")); + } + + @Test + void testListAllAsJson_jsonObjectSetting() { + // Given + JsonObject expected = Json.createObjectBuilder() + .add("default", "2147483648") + .add("fileOne", "4000000000") + .add("s3", "8000000000") + .build(); + + List<Setting> resultList = List.of( + new Setting(SettingsServiceBean.Key.MaxFileUploadSizeInBytes.toString(), "{\"default\":\"2147483648\",\"fileOne\":\"4000000000\",\"s3\":\"8000000000\"}") + ); + when(typedQuery.getResultList()).thenReturn(resultList); + + // When + JsonObject result = 
settingsServiceBean.listAllAsJson(); + + // Then + assertEquals(1, result.size()); + assertEquals(expected.toString(), result.getJsonObject(SettingsServiceBean.Key.MaxFileUploadSizeInBytes.toString()).toString()); + } + + @Test + void testListAllAsJson_jsonArraySetting() { + // Given + JsonArray expected = Json.createArrayBuilder() + .add(2147483648L) + .add("4000000000") + .add("8000000000") + .build(); + + List<Setting> resultList = List.of( + new Setting(SettingsServiceBean.Key.MaxFileUploadSizeInBytes.toString(), "[2147483648, \"4000000000\", \"8000000000\"]") + ); + when(typedQuery.getResultList()).thenReturn(resultList); + + // When + JsonObject result = settingsServiceBean.listAllAsJson(); + + // Then + assertEquals(1, result.size()); + assertEquals(expected.toString(), result.getJsonArray(SettingsServiceBean.Key.MaxFileUploadSizeInBytes.toString()).toString()); + } + + @Test + void testListAllAsJson_localizedSettings() { + // Given + List<Setting> resultList = List.of( + new Setting("localizedKey", "value_base"), + new Setting("localizedKey", "en", "value_en"), + new Setting("localizedKey", "fr", "value_fr") + ); + when(typedQuery.getResultList()).thenReturn(resultList); + + // When + JsonObject result = settingsServiceBean.listAllAsJson(); + + // Then + assertEquals(3, result.size()); + assertEquals("value_base", result.getString("localizedKey")); + assertEquals("value_en", result.getString("localizedKey/lang/en")); + assertEquals("value_fr", result.getString("localizedKey/lang/fr")); + } + } + + @Nested + class ConvertJsonToSettingsTest { + + @Test + void testConvertJsonToSettings_simpleKeyValues() { + // Given + JsonObject input = Json.createObjectBuilder() + .add(":Key1", "Value1") + .add(":Key2", "123456") + // The REST API endpoint presents a JsonObject, which may have number literals in it. + // Check that we can cope with that. 
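The comment above points at a conversion rule worth spelling out: whatever JSON value type the client sends (string or number literal), the stored setting content is its string form. A simplified sketch using plain Java maps instead of `jakarta.json` (the real code additionally has to unwrap `JsonString` values, whose `toString()` carries quotes); the class and method names here are illustrative, not the Dataverse API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch (hypothetical names, not the real Setting entity or service)
// of the rule the test above checks: JSON values of any type are stored as
// their string form, so "123456" and 123456 end up as the same content.
public class SettingsConversionSketch {

    static Map<String, String> convert(Map<String, Object> json) {
        Map<String, String> settings = new LinkedHashMap<>();
        // String.valueOf covers strings and numbers alike here; a real
        // jakarta.json implementation must unwrap JsonString explicitly
        json.forEach((name, value) -> settings.put(name, String.valueOf(value)));
        return settings;
    }

    public static void main(String[] args) {
        Map<String, String> out = convert(Map.of(":Key2", "123456", ":Key3", 123456));
        // both forms yield the same stored content
        System.out.println(out.get(":Key2").equals(out.get(":Key3")));
    }
}
```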
+                .add(":Key3", 123456)
+                // Make sure we deal with quotes
+                .add(":Key4", " Dataverse © 2014-2025")
+                .build();
+
+            // When
+            Set<Setting> result = SettingsServiceBean.convertJsonToSettings(input);
+
+            // Then
+            Map<String, String> expectedResults = Map.of(
+                ":Key1", "Value1",
+                ":Key2", "123456",
+                ":Key3", "123456",
+                ":Key4", " Dataverse © 2014-2025"
+            );
+            for (Setting setting : result) {
+                assertEquals(expectedResults.get(setting.getName()), setting.getContent());
+            }
+        }
+
+        @Test
+        void testConvertJsonToSettings_localizedKeysWithSimpleValues() {
+            // Given
+            JsonObject input = Json.createObjectBuilder()
+                .add(":LocalizedKey/lang/en", "EnglishValue")
+                .add(":LocalizedKey/lang/fr", "FrenchValue")
+                .build();
+
+            // When
+            Set<Setting> result = SettingsServiceBean.convertJsonToSettings(input);
+
+            // Then
+            // Note: Setting.equals() does not compare the content, but we are not interested in the content here anyway.
+            assertEquals(
+                Set.of(new Setting(":LocalizedKey", "en", "EnglishValue"),
+                    new Setting(":LocalizedKey", "fr", "FrenchValue")
+                ), result);
+        }
+
+        @Test
+        void testConvertJsonToSettings_emptyJson() {
+            // Given
+            JsonObject input = Json.createObjectBuilder().build();
+
+            // When
+            Set<Setting> result = SettingsServiceBean.convertJsonToSettings(input);
+
+            // Then
+            assertEquals(0, result.size());
+        }
+
+        @Test
+        void testConvertJsonToSettings_complexJsonValue() {
+            // Given
+            JsonObject input = Json.createObjectBuilder()
+                .add(
+                    ":MaxFileUploadSizeInBytes",
+                    Json.createObjectBuilder()
+                        .add("default", "2147483648")
+                        .add("fileOne", "4000000000")
+                        .add("s3", "8000000000")
+                        .build())
+                .build();
+
+            // When
+            Set<Setting> result = SettingsServiceBean.convertJsonToSettings(input);
+
+            // Then
+            assertEquals(1, result.size());
+            assertEquals(new Setting(":MaxFileUploadSizeInBytes",
+                    "{\"default\":\"2147483648\",\"fileOne\":\"4000000000\",\"s3\":\"8000000000\"}"),
+                result.stream().toList().get(0));
+        }
+
+    }
+
+    @Nested
+    class ReplaceAllSettingsTest {
+
+        static TypedQuery<Setting> typedQuery = mock(TypedQuery.class);
+        static EntityManager em = mock(EntityManager.class);
+        static SettingsServiceBean settingsServiceBean = new SettingsServiceBean();
+
+        @BeforeAll
+        static void setup() {
+            settingsServiceBean.em = em;
+
+            when(em.createNamedQuery(
+                    ArgumentMatchers.eq("Setting.findAll"),
+                    ArgumentMatchers.eq(Setting.class)))
+                .thenReturn(typedQuery);
+        }
+
+        @AfterEach
+        void reset() {
+            // After each test, we need to clear the invocations for test isolation.
+            clearInvocations(em);
+        }
+
+        @Test
+        void testReplaceAllSettings_null() {
+            // When/Then
+            NullPointerException exception = assertThrows(NullPointerException.class,
+                () -> settingsServiceBean.replaceAllSettings(null));
+            assertEquals("The list of new settings cannot be null (it may be empty).", exception.getMessage());
+        }
+
+        @Test
+        void testReplaceAllSettings_updateDeleteCreate() {
+            // Given
+            Setting existingSetting1 = new Setting(":Key1", "Value1");
+            Setting existingSetting2 = new Setting(":Key2", "Value2");
+            Setting newSetting1 = new Setting(":Key1", "UpdatedValue1");
+            Setting newSetting3 = new Setting(":Key3", "Value3");
+
+            when(typedQuery.getResultList()).thenReturn(List.of(existingSetting1, existingSetting2));
+
+            // When
+            Map<Setting, SettingsServiceBean.Op> result = settingsServiceBean.replaceAllSettings(Set.of(newSetting1, newSetting3));
+
+            // Then
+            assertEquals(3, result.size());
+            assertEquals(SettingsServiceBean.Op.UPDATED, result.get(existingSetting1));
+            assertEquals(SettingsServiceBean.Op.DELETED, result.get(existingSetting2));
+            assertEquals(SettingsServiceBean.Op.CREATED, result.get(newSetting3));
+            // We cannot track the em.merge() call in this unit test, as it happens in ORM code, beyond our reach.
+            // Thus, check that the update to the ORM-tracked entity happened.
+            assertEquals("UpdatedValue1", existingSetting1.getContent());
+
+            // Verify interactions
+            verify(em).remove(existingSetting2);
+            verify(em).persist(newSetting3);
+            verify(em).flush(); // verify persistence is enforced
+        }
+
+        @Test
+        void testReplaceAllSettings_noChanges() {
+            // Given
+            Setting existingSetting = new Setting(":Key1", "Value1");
+            Setting newSetting = new Setting(":Key1", "Value1");
+
+            when(typedQuery.getResultList()).thenReturn(List.of(existingSetting));
+
+            // When
+            Map<Setting, SettingsServiceBean.Op> result = settingsServiceBean.replaceAllSettings(Set.of(newSetting));
+
+            // Then
+            assertEquals(1, result.size());
+            assertEquals(SettingsServiceBean.Op.UNCHANGED, result.get(existingSetting));
+
+            // Verify no interactions causing change
+            verify(em, never()).persist(any(Setting.class));
+            verify(em, never()).remove(any(Setting.class));
+            verify(em, never()).merge(any(Setting.class));
+        }
+    }
+}
\ No newline at end of file
diff --git a/src/test/java/edu/harvard/iq/dataverse/util/ListSplitUtilTest.java b/src/test/java/edu/harvard/iq/dataverse/util/ListSplitUtilTest.java
new file mode 100644
index 00000000000..9535cb4cca2
--- /dev/null
+++ b/src/test/java/edu/harvard/iq/dataverse/util/ListSplitUtilTest.java
@@ -0,0 +1,31 @@
+package edu.harvard.iq.dataverse.util;
+
+import org.junit.jupiter.api.DisplayName;
+import org.junit.jupiter.api.Test;
+
+import java.util.List;
+import java.util.Set;
+
+import static org.junit.jupiter.api.Assertions.*;
+
+class ListSplitUtilTest {
+
+    @Test
+    @DisplayName("split preserves empty tokens and quotes")
+    void testSplitBasic() {
+        List<String> tokens = ListSplitUtil.split(" a , b, \"c\" , , d ");
+        assertEquals(List.of("a", "b", "\"c\"", "", "d"), tokens);
+    }
+
+    @Test
+    @DisplayName("splitToLowerCaseSet lowercases and de-dups (order not asserted)")
+    void testSplitToLowerCaseSet() {
+        assertTrue(ListSplitUtil.splitToLowerCaseSet(null).isEmpty(), "null should yield empty set");
+        assertTrue(ListSplitUtil.splitToLowerCaseSet(" ").isEmpty(), "blank should yield empty set");
+        Set<String> set = ListSplitUtil.splitToLowerCaseSet("B, a, b, A, C");
+        assertEquals(Set.of("b", "a", "c"), set);
+
+        Set<String> quoted = ListSplitUtil.splitToLowerCaseSet("\"A\" , \"b\" , \"A\"");
+        assertEquals(Set.of("\"a\"", "\"b\""), quoted);
+    }
+}
diff --git a/src/test/java/edu/harvard/iq/dataverse/util/SystemConfigTest.java b/src/test/java/edu/harvard/iq/dataverse/util/SystemConfigTest.java
index 82b89bca678..06026962d2c 100644
--- a/src/test/java/edu/harvard/iq/dataverse/util/SystemConfigTest.java
+++ b/src/test/java/edu/harvard/iq/dataverse/util/SystemConfigTest.java
@@ -12,6 +12,8 @@
 import org.mockito.Mock;
 import org.mockito.junit.jupiter.MockitoExtension;
 
+import java.util.Map;
+
 import static org.junit.jupiter.api.Assertions.assertEquals;
 import static org.junit.jupiter.api.Assertions.assertTrue;
 import static org.mockito.Mockito.doReturn;
@@ -142,5 +144,103 @@ void testGetThumbnailSizeLimit() {
         assertEquals(1000000l, SystemConfig.getThumbnailSizeLimit("PDF"));
         assertEquals(0l, SystemConfig.getThumbnailSizeLimit("NoSuchType"));
     }
-
+
+    @Test
+    void testGetTabularIngestSizeLimitsWithoutSetting() {
+        // given
+        doReturn(null).when(settingsService).getValueForKey(SettingsServiceBean.Key.TabularIngestSizeLimit);
+
+        // when
+        Map<String, Long> result = systemConfig.getTabularIngestSizeLimits();
+
+        // then
+        assertEquals(1, result.size());
+        assertEquals(-1L, (long) result.get(SystemConfig.TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY));
+    }
+
+    @Test
+    void testGetTabularIngestSizeLimitsWithValidJson() {
+        // given
+        String validJson = "{\"csV\": \"5000\", \"tSv\": \"10000\"}";
+        doReturn(validJson).when(settingsService).getValueForKey(SettingsServiceBean.Key.TabularIngestSizeLimit);
+
+        // when
+        Map<String, Long> result = systemConfig.getTabularIngestSizeLimits();
+
+        // then
+        assertEquals(3, result.size());
+        assertEquals(-1L, (long) result.get(SystemConfig.TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY));
+        assertEquals(5000L, result.get("csv"));
+        assertEquals(10000L, result.get("tsv"));
+    }
+
+    @Test
+    void testGetTabularIngestSizeLimitsWithSingleValue() {
+        // given
+        String singleValue = "8000";
+        doReturn(singleValue).when(settingsService).getValueForKey(SettingsServiceBean.Key.TabularIngestSizeLimit);
+
+        // when
+        Map<String, Long> result = systemConfig.getTabularIngestSizeLimits();
+
+        // then
+        assertEquals(1, result.size());
+        assertEquals(8000L, (long) result.get(SystemConfig.TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY));
+    }
+
+    @Test
+    void testGetTabularIngestSizeLimitsWithSingleInvalidValue() {
+        // given
+        String singleValue = "this-aint-no-number";
+        doReturn(singleValue).when(settingsService).getValueForKey(SettingsServiceBean.Key.TabularIngestSizeLimit);
+
+        // when
+        Map<String, Long> result = systemConfig.getTabularIngestSizeLimits();
+
+        // then
+        assertEquals(1, result.size());
+        assertEquals(0L, (long) result.get(SystemConfig.TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY));
+    }
+
+    @Test
+    void testGetTabularIngestSizeLimitsWithJsonButUnsupportedJsonInt() {
+        // given
+        String invalidJson = "{\"default\": 0}";
+        doReturn(invalidJson).when(settingsService).getValueForKey(SettingsServiceBean.Key.TabularIngestSizeLimit);
+
+        // when
+        Map<String, Long> result = systemConfig.getTabularIngestSizeLimits();
+
+        // then
+        assertEquals(1, result.size());
+        assertEquals(0L, (long) result.get(SystemConfig.TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY));
+    }
+
+    @Test
+    void testGetTabularIngestSizeLimitsWithInvalidJson() {
+        // given
+        String invalidJson = "{invalid:}";
+        doReturn(invalidJson).when(settingsService).getValueForKey(SettingsServiceBean.Key.TabularIngestSizeLimit);
+
+        // when
+        Map<String, Long> result = systemConfig.getTabularIngestSizeLimits();
+
+        // then
+        assertEquals(1, result.size());
+        assertEquals(0L, (long) result.get(SystemConfig.TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY));
+    }
+
+    @Test
+    void testGetTabularIngestSizeLimitsWithInvalidNumberInValidJson() {
+        // given
+        String invalidJson = "{\"csv\": \"this-is-not-a-number\", \"tSv\": \"10000\"}";
+        doReturn(invalidJson).when(settingsService).getValueForKey(SettingsServiceBean.Key.TabularIngestSizeLimit);
+
+        // when
+        Map<String, Long> result = systemConfig.getTabularIngestSizeLimits();
+
+        // then
+        assertEquals(1, result.size());
+        assertEquals(0L, (long) result.get(SystemConfig.TABULAR_INGEST_SIZE_LIMITS_DEFAULT_KEY));
+    }
 }
diff --git a/src/test/java/edu/harvard/iq/dataverse/util/json/JsonUtilTest.java b/src/test/java/edu/harvard/iq/dataverse/util/json/JsonUtilTest.java
index 3e4f9a690d2..b703597a91c 100644
--- a/src/test/java/edu/harvard/iq/dataverse/util/json/JsonUtilTest.java
+++ b/src/test/java/edu/harvard/iq/dataverse/util/json/JsonUtilTest.java
@@ -1,7 +1,15 @@
 package edu.harvard.iq.dataverse.util.json;
 
 import static org.junit.jupiter.api.Assertions.assertEquals;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+
+import jakarta.json.JsonException;
+import jakarta.json.JsonValue;
+import org.junit.jupiter.api.Nested;
 import org.junit.jupiter.api.Test;
+import org.junit.jupiter.params.ParameterizedTest;
+import org.junit.jupiter.params.provider.NullAndEmptySource;
+import org.junit.jupiter.params.provider.ValueSource;
 
 class JsonUtilTest {
@@ -15,5 +23,32 @@ void testPrettyPrint() {
         assertEquals("[\n \"junk\"\n]", JsonUtil.prettyPrint("[\"junk\"]"));
         assertEquals("{\n" + " \"foo\": \"bar\"\n" + "}", JsonUtil.prettyPrint("{\"foo\": \"bar\"}"));
     }
-
+
+    @Nested
+    class JsonValues {
+        @Test
+        void testGetJsonValueWithJsonObject() {
+            String jsonObject = "{\"key\": \"value\"}";
+            JsonValue result = JsonUtil.getJsonValue(jsonObject);
+            assertEquals(JsonValue.ValueType.OBJECT, result.getValueType());
+            assertEquals("value", result.asJsonObject().getString("key"));
+        }
+
+        @Test
+        void testGetJsonValueWithJsonArray() {
+            String jsonArray = "[\"element1\", \"element2\"]";
+            JsonValue result = JsonUtil.getJsonValue(jsonArray);
+            assertEquals(JsonValue.ValueType.ARRAY, result.getValueType());
+            assertEquals("element1", result.asJsonArray().getString(0));
+            assertEquals("element2", result.asJsonArray().getString(1));
+        }
+
+        @ParameterizedTest
+        @NullAndEmptySource
+        @ValueSource(strings = {" ", " \"\"", "\"primitive\"", "{invalid}", "[invalid]", "[1234, invalid]"})
+        void testGetJsonValueWithInvalidJson(String sut) {
+            assertThrows(JsonException.class, () -> JsonUtil.getJsonValue(sut));
+        }
+    }
+
 }
diff --git a/tests/integration-tests.txt b/tests/integration-tests.txt
index 2a15ac3ce74..33d137c893a 100644
--- a/tests/integration-tests.txt
+++ b/tests/integration-tests.txt
@@ -1 +1 @@
-DataversesIT,DatasetsIT,SwordIT,AdminIT,BuiltinUsersIT,UsersIT,UtilIT,ConfirmEmailIT,FileMetadataIT,FilesIT,SearchIT,InReviewWorkflowIT,HarvestingServerIT,HarvestingClientsIT,MoveIT,MakeDataCountApiIT,FileTypeDetectionIT,EditDDIIT,ExternalToolsIT,AccessIT,DuplicateFilesIT,DownloadFilesIT,LinkIT,DeleteUsersIT,DeactivateUsersIT,AuxiliaryFilesIT,InvalidCharactersIT,LicensesIT,NotificationsIT,BagIT,MetadataBlocksIT,NetcdfIT,SignpostingIT,FitsIT,LogoutIT,DataRetrieverApiIT,ProvIT,S3AccessIT,OpenApiIT,InfoIT,DatasetFieldsIT,SavedSearchIT,DatasetTypesIT,DataverseFeaturedItemsIT,SendFeedbackApiIT,CustomizationIT,JsonLDExportIT
+DataversesIT,DatasetsIT,SwordIT,AdminIT,BuiltinUsersIT,UsersIT,UtilIT,ConfirmEmailIT,FileMetadataIT,FilesIT,SearchIT,InReviewWorkflowIT,HarvestingServerIT,HarvestingClientsIT,MoveIT,MakeDataCountApiIT,FileTypeDetectionIT,EditDDIIT,ExternalToolsIT,AccessIT,DuplicateFilesIT,DownloadFilesIT,LinkIT,DeleteUsersIT,DeactivateUsersIT,AuxiliaryFilesIT,InvalidCharactersIT,LicensesIT,NotificationsIT,BagIT,MetadataBlocksIT,NetcdfIT,SignpostingIT,FitsIT,LogoutIT,DataRetrieverApiIT,ProvIT,S3AccessIT,OpenApiIT,InfoIT,DatasetFieldsIT,SavedSearchIT,DatasetTypesIT,DataverseFeaturedItemsIT,SendFeedbackApiIT,CustomizationIT,JsonLDExportIT,WorkflowsIT