CSUC
diff --git a/README.md b/README.md
Lines changed: 14 additions & 7 deletions
diff --git a/README_ENG.md b/README_ENG.md
Lines changed: 14 additions & 6 deletions
diff --git a/REVISAT/README.md b/REVISAT/README.md
Lines changed: 2 additions & 0 deletions
diff --git a/REVISAT/README_ENG.md b/REVISAT/README_ENG.md
Lines changed: 2 additions & 0 deletions
diff --git a/REVISAT/REVISAT.py b/REVISAT/REVISAT.py
Lines changed: 181 additions & 0 deletions
diff --git a/REVISAT/REVISAT_script.ipynb b/REVISAT/REVISAT_script.ipynb
Lines changed: 52 additions & 39 deletions
diff --git a/change_CSV_delimiter/README.md b/change_CSV_delimiter/README.md
Lines changed: 1 addition & 0 deletions
@@ -8,13 +8,20 @@
 En aquest repositori el [Consorci de Serveis Universitaris de Catalunya (CSUC)](https://www.csuc.cat/ca) publica scripts que les institucions i els usuaris del [Repositori de Dades de Recerca](https://dataverse.csuc.cat/) poden fer servir per realitzar tasques de forma automatitzada. Tots els scripts requereixen usar l'[API de Dataverse](https://guides.dataverse.org/en/latest/api/).
 
 ## Descripció dels scripts
- 
-- **Crear fitxers README.txt**: Aquest script permet crear un fitxer README automaticament a partir de les metadades d'un dataset depositat al repositori Dataverse.
-- **Pujada automàtica de fitxers**: Aquest script permet pujar fitxers automàticament a un repositori Dataverse. 
-- **Moure datasets entre instàncies**: Aquest script permet moure datasets entre instàncies d'un repositori Dataverse.
-- **Extreure metadades en un fitxer tabular**: Aquest script permet descarregar les metadades d'un conjunt de dades en format tabular.
-- **REVISAT**:  Aquest script automatitza i facilita la revisió d'un dataset fent servir la majoria dels criteris del [REVISAT](https://confluence.csuc.cat/display/RDM/REVISAT).
-- **Descarregar datasets sencers**: Aquest script permet descarregar un conjunt de dades d'un repositori Dataverse.
+
+- **REVISAT**: Automatitza la revisió d’un dataset segons els criteris del [REVISAT](https://confluence.csuc.cat/display/RDM/REVISAT).
+- **change_CSV_delimiter**: Canvia el delimitador dels fitxers CSV.
+- **create_Readme**: Genera automàticament un fitxer README a partir de les metadades del dataset.
+- **dataset_size_calculator**: Calcula la mida total d’un conjunt de dades.
+- **extract_metadata**: Extreu metadades de datasets i les desa en format tabular.
+- **metrics**: Obté mètriques d’ús o descàrregues dels datasets.
+- **move_dataset**: Permet moure datasets entre diferents instàncies de Dataverse.
+- **multiple_datasets_metadata**: Extreu metadades de múltiples datasets de manera massiva.
+- **persistent_link**: Comprova i mostra l’enllaç persistent correcte d’un dataset o fitxer.
+- **related_publication_check**: Comprova si un dataset té una publicació relacionada vinculada correctament.
+- **transform_excel**: Transforma fitxers Excel segons formats compatibles amb el repositori.
+- **upload_files**: Automatitza la pujada de fitxers a Dataverse.
+- **verification_readme**: Verifica si el fitxer README és a dins dels datasets d'una instància.
 
 ## Contacte
 
 
@@ -8,12 +8,20 @@ In this repository the [Consortium of University Services of Catalonia (CSUC)](h
 
 ## Description of the scripts
 
-- **Create README.txt files**: This script creates a README.txt file automatically using the metadata in a dataset record.
-- **Automatic file upload**: This script uploads files automatically to a dataset record in a Dataverse repository.
-- **Moving datasets between Dataverses**: This script moves datasets between different Dataverses in a Dataverse repository.
-- **Extract metadata in a tabular file**: This script downloads the metadata of a dataset in tabular format.
-- **REVISAT**: This script automates and facilitates the review of a dataset using most of the [REVISAT criteria](https://confluence.csuc.cat/display/RDM/REVISAT).
-- **Download full datasets**: This script downloads a dataset from a Dataverse repository.
+- **REVISAT**: Automates the review of a dataset based on the [REVISAT](https://confluence.csuc.cat/display/RDM/REVISAT) checklist.
+- **change_CSV_delimiter**: Changes the delimiter of CSV files.
+- **create_Readme**: Automatically generates a README file based on the dataset metadata.
+- **dataset_size_calculator**: Calculates the total size of a dataset.
+- **extract_metadata**: Extracts metadata from datasets and saves it in tabular format.
+- **metrics**: Retrieves usage or download metrics for datasets.
+- **move_dataset**: Allows moving datasets between different Dataverse instances.
+- **multiple_datasets_metadata**: Extracts metadata from multiple datasets in bulk.
+- **persistent_link**: Checks and displays the correct persistent link of a dataset or file.
+- **related_publication_check**: Checks whether a dataset has a properly linked related publication.
+- **transform_excel**: Transforms Excel files into formats compatible with the repository.
+- **upload_files**: Automates the upload of files to Dataverse.
+- **verification_readme**: Verifies whether the README file is included in the datasets of an instance.
+
 
 ## Contact
 If you have questions or comments about these scripts open an issue or send an e-mail to <aco@csuc.cat>.
@@ -1,5 +1,7 @@
 [![ca](https://img.shields.io/badge/lang-ca-blue.svg)](https://github.com/CSUC/RDR-scripts/blob/main/REVISAT/README.md)
 [![en](https://img.shields.io/badge/lang-en-green.svg)](https://github.com/CSUC/RDR-scripts/blob/main/REVISAT/README_ENG.md)
+[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/CSUC/RDR-scripts/blob/main/REVISAT/REVISAT_script.ipynb)
+
 # Script d'Avaluació de datasets (REVISAT)
 Per a qualsevol consulta sobre el codi, poseu-vos en contacte amb rdr-contacte@csuc.cat
 
 
@@ -1,5 +1,7 @@
 [![ca](https://img.shields.io/badge/lang-ca-blue.svg)](https://github.com/CSUC/RDR-scripts/blob/main/REVISAT/README.md)
 [![en](https://img.shields.io/badge/lang-en-green.svg)](https://github.com/CSUC/RDR-scripts/blob/main/REVISAT/README_ENG.md)
+[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/CSUC/RDR-scripts/blob/main/REVISAT/REVISAT_script.ipynb)
+
 # Dataset Evaluation Script (REVISAT / CURATED)
 For any queries regarding the code, contact rdr-contacte@csuc.cat
 
 
@@ -0,0 +1,181 @@
+# =======================
+# CONFIGURATION PARAMETERS
+# =======================
+doi = ""  # Full DOI, e.g., "doi:10.34810/data123456"
+token = ""  # API token from https://dataverse.csuc.cat/dataverseuser.xhtml?selectTab=apiTokenTab
+driver = None  # Use: webdriver.Chrome(), webdriver.Firefox(), or None
+opcions = [
+    "Universitat Rovira i Virgili",
+    "Universitat Pompeu Fabra",
+    "Universitat Oberta de Catalunya",
+    "Vall d’Hebron Institut de Recerca",
+    "Centre for Research on Ecology and Forestry Applications",
+    "Universitat Ramon Llull",
+    "Consorci Institut D'Investigacions Biomèdiques August Pi i Sunyer",
+    "Centre de Recerca en Agrigenòmica",
+    "Institut Català de Nanociència i Nanotecnologia",
+    "Institut de Recerca Sant Joan de Déu",
+    "Universitat Autònoma de Barcelona",
+    "Universitat Politècnica de Catalunya",
+    "Consorci de Serveis Universitaris de Catalunya",
+    "Institut de Física d'Altes Energies",
+    "Universitat Internacional de Catalunya",
+    "Centre de Recerca Matemàtica",
+    "Institut d'Investigació Biomèdica de Bellvitge",
+    "Universitat de Lleida",
+    "Universitat de Girona",
+    "i2CAT",
+    "Institut de Recerca i Tecnologia Agroalimentàries",
+    "Fundación Josep Carreras Contra la Leucemia",
+    "Centre for Demographic Studies",
+    "Centre Tecnològic Forestal de Catalunya",
+    "Universitat de Vic - Universitat Central de Catalunya",
+    "IrsiCaixa",
+    "Institute for Bioengineering of Catalonia",
+    "Biomedical Research Institute of Lleida",
+    "Institut Barcelona d'Estudis Internacionals",
+    "Barcelona University",
+    "Catalan Institute for Water Research",
+    "Institute of Research and Innovation Parc Taulí",
+    "Institut Català de Paleoecologia Humana i Evolució Social",
+    "Universitat de les Illes Balears",
+    "Institute of Photonic Sciences",
+    "Institute for Research in Biomedicine",
+    "Agrotecnio - Centre for Food and Agriculture Research",
+    "Institut d'Investigació Biomèdica de Girona",
+    "Institut Català d'Arqueologia Clàssica",
+    "Barcelona Institute for Global Health"
+]
+
+
+# =======================
+# IMPORTS & INSTALLATION
+# =======================
+import os
+import sys
+import subprocess
+from datetime import date
+from pyDataverse.api import NativeApi
+from selenium import webdriver
+from selenium.webdriver.common.by import By
+from collections import Counter
+from IPython.display import HTML, display
+
+# Install necessary packages if running interactively (optional)
+def install_packages():
+    subprocess.check_call([sys.executable, "-m", "pip", "install", "--upgrade", "pip"])
+    subprocess.check_call([sys.executable, "-m", "pip", "install", "pyDataverse"])
+    subprocess.check_call([sys.executable, "-m", "pip", "install", "selenium"])
+    subprocess.check_call([sys.executable, "-m", "pip", "install", "--upgrade", "tensorflow-probability"])
+
+# =======================
+# MAIN FUNCTION
+# =======================
+def Meta(doi, token, driver, opcions):
+    today = date.today()
+    print("Data:", today)
+
+    base_url = 'https://dataverse.csuc.cat/'
+    api = NativeApi(base_url, token)
+    Metadata = api.get_dataset(doi)
+
+    fields_metadata = Metadata.json()["data"]["latestVersion"]["metadataBlocks"]["citation"]["fields"]
+    metadata_repositori = [field["typeName"] for field in fields_metadata]
+
+    Metadata_min_req = ['title', 'datasetContact', 'dsDescription', 'keyword', 'subject', 'kindOfData', 'author']
+    intersect_metadata = list(set(metadata_repositori) & set(Metadata_min_req))
+    same_metadata = len(list(set(Metadata_min_req) ^ set(intersect_metadata)))
+
+    print("\nConté les metadades mínimes obligatòries?")
+    if same_metadata != 0:
+        print("NO", list(set(Metadata_min_req) ^ set(intersect_metadata)))
+    else:
+        print("SÍ")
+
+    # Title
+    index_title = metadata_repositori.index('title')
+    titol = fields_metadata[index_title]["value"]
+    print("\nTítol dataset:\n{}\n".format(titol))
+
+    titol_1 = titol.split(":")
+
+    # Related publication
+    print("En el cas que el dataset tingui una publicació relacionada, inclou la citació?")
+    if 'publication' in metadata_repositori:
+        print("SÍ")
+        index_publication = metadata_repositori.index('publication')
+        Rel_pub = [pub["publicationCitation"]["value"] for pub in fields_metadata[index_publication]["value"]]
+        for citation in Rel_pub:
+            print(citation)
+
+        if "Replication Data for" in titol_1[0] and len(titol_1) > 1:
+            only_title = titol[21:].strip()  # drop the "Replication Data for:" prefix and the space that follows it
+            print("\nEl títol inclou: Replication data for")
+            for i in Rel_pub[0].split("."):
+                if only_title.casefold() == i.strip().casefold():
+                    print("Els títols coincideixen")
+        else:
+            print("\nNo és rèplica de l'article")
+    else:
+        print("\nNo té publicacions relacionades")
+
+    # Author info
+    index_author = metadata_repositori.index('author')
+    author_id = []
+    afiliacion = []
+    institucion = []
+
+    for author in fields_metadata[index_author]["value"]:
+        aff = author.get("authorAffiliation", {})
+        aff_val = aff.get("expandedvalue", {}).get("termName") or aff.get("value")
+        if aff_val:
+            afiliacion.append(aff_val)
+
+        if "authorIdentifier" in author:
+            author_id.append("SÍ")
+
+    for aff in afiliacion:
+        matched = any(inst in aff for inst in opcions)
+        institucion.append("SÍ" if matched else "NO")
+
+    print("\nAlmenys un/a dels/les autors/es pertany a la institució on es diposita:", "SÍ" if "SÍ" in institucion else "NO")
+    print("Almenys un/a dels/les autors/es informa del seu ORCID?")
+    print("ORCID: ", "SÍ" if "SÍ" in author_id else "NO")
+
+    # Description
+    index_descripcion = metadata_repositori.index('dsDescription')
+    descripcion = fields_metadata[index_descripcion]["value"][0]['dsDescriptionValue']["value"]
+    print("\nDescripció:\n", descripcion)
+
+    # File formats
+    print("\nFormat de fitxers")
+    total_files = len(Metadata.json()['data']['latestVersion']['files'])
+    files = [file['dataFile']['filename'] for file in Metadata.json()['data']['latestVersion']['files']]
+    extensions = [os.path.splitext(f)[1] for f in files]
+    print(Counter(extensions))
+
+    lowercase_files = [f.lower() for f in files]
+    if "readme.txt" in lowercase_files:
+        print("Sí que conté el fitxer readme.txt")
+
+    # License
+    print("\nLlicència:")
+    license_info = Metadata.json()["data"]['latestVersion'].get("license", {}).get("name") \
+        or Metadata.json()["data"]['latestVersion'].get('termsOfUse')
+    print(license_info)
+
+    # F-UJI
+    if driver is None:
+        print("\nAvalueu el dataset manualment a F-UJI: https://www.f-uji.net/")
+    else:
+        driver.get("https://www.f-uji.net/")
+        driver.find_element(By.XPATH, '/html/body/div[1]/div[1]/div/p/a').click()
+        driver.find_element(By.XPATH, '//*[@id="pid"]').send_keys(doi)
+        driver.find_element(By.XPATH, '//*[@id="assessment_form"]/div/form/div[4]/button').click()
+
+
+# =======================
+# EXECUTE SCRIPT
+# =======================
+if __name__ == "__main__":
+    Meta(doi, token, driver, opcions)
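
Two of the checks inside `Meta` (minimum required metadata, and whether a "Replication Data for:" title matches the related publication) can be exercised without calling the Dataverse API. The following is a minimal sketch of that idea, not the script's actual structure: `missing_required_fields` and `replication_title_matches` are hypothetical helper names, and the prefix test uses `startswith`, which is slightly stricter than the substring test in the diff.

```python
# Hypothetical helpers (not part of the diff): the same checks as pure functions.
REQUIRED_FIELDS = ['title', 'datasetContact', 'dsDescription', 'keyword',
                   'subject', 'kindOfData', 'author']
PREFIX = "Replication Data for:"


def missing_required_fields(fields):
    """Return the required typeNames absent from a citation-block field list."""
    present = {field["typeName"] for field in fields}
    return sorted(set(REQUIRED_FIELDS) - present)


def replication_title_matches(title, citation):
    """True if the title after the prefix equals one '.'-separated part of the citation."""
    if not title.startswith(PREFIX):
        return False
    bare_title = title[len(PREFIX):].strip().casefold()
    # Like the script, this splits on "." and so will miss titles that contain periods.
    return any(bare_title == part.strip().casefold() for part in citation.split("."))


if __name__ == "__main__":
    sample_fields = [{"typeName": "title"}, {"typeName": "author"}]
    print(missing_required_fields(sample_fields))
    print(replication_title_matches(
        "Replication Data for: Soil carbon in forests",
        "Doe, J. (2024). Soil Carbon in Forests. Journal X.",
    ))
```

Keeping the checks free of network calls and `print` statements would let them be unit-tested against hand-written metadata dictionaries.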
@@ -2,50 +2,26 @@
   "cells": [
     {
       "cell_type": "markdown",
-      "id": "7852b47e-53f8-48ca-ae07-79f98e37dba2",
       "metadata": {
-        "id": "7852b47e-53f8-48ca-ae07-79f98e37dba2"
+        "id": "view-in-github",
+        "colab_type": "text"
       },
       "source": [
-        "## REVISAT\n",
-        "REVISAT is a script that allows reviewing a dataset, in draft, before being published, to ensure compliance with good open access practices. It is a first version, and as the repository software is updated and/or metadata is updated, the script will be changed accordingly.\n",
-        "If you as a user have any doubts about the operation, proposal, or suggestion for improvement and want to incorporate it into the script, please write to us at: rdr-contacte@csuc.cat\n",
-        "\n",
-        "Last updated: 2023-11-14"
+        "<a href=\"https://colab.research.google.com/github/CSUC/RDR-scripts/blob/main/REVISAT/REVISAT_script.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
       ]
     },
     {
-      "cell_type": "code",
-      "execution_count": null,
-      "id": "3fff23c3-7143-41d6-9123-a55fcfb4b596",
+      "cell_type": "markdown",
+      "id": "7852b47e-53f8-48ca-ae07-79f98e37dba2",
       "metadata": {
-        "cellView": "form",
-        "id": "3fff23c3-7143-41d6-9123-a55fcfb4b596"
+        "id": "7852b47e-53f8-48ca-ae07-79f98e37dba2"
       },
-      "outputs": [],
       "source": [
-        "# @title Install or update libraries (Click execution button &#x25B6; )\n",
-        "import ipywidgets as widgets\n",
-        "from IPython.display import display, HTML, clear_output\n",
+        "## REVISAT\n",
+        "REVISAT is a script that allows reviewing a dataset, in draft, before being published, to ensure compliance with good open access practices. It is a first version, and as the repository software is updated and/or metadata is updated, the script will be changed accordingly.\n",
+        "If you as a user have any doubts about the operation, proposal, or suggestion for improvement and want to incorporate it into the script, please write to us at: rdr-contacte@csuc.cat\n",
         "\n",
-        "# Function to install required packages\n",
-        "def install_packages(b):\n",
-        "    clear_output(wait=True)\n",
-        "    !pip install --upgrade pip -q\n",
-        "    !pip --upgrade tensorflow-probability -q\n",
-        "    !pip install pyDataverse -q\n",
-        "    !pip install selenium -q\n",
-        "    print(\"S'han descarregat o actualitzat les llibreries.\")\n",
-        "\n",
-        "# Displaying installation message\n",
-        "display(HTML(\"<p style='font-size:14px;'><b>Feu clic al botó següent per instal·lar les llibreries.</b></p>\"))\n",
-        "\n",
-        "# Creating installation button\n",
-        "install_button = widgets.Button(description='Instal·lar llibreries')\n",
-        "install_button.on_click(install_packages)\n",
-        "\n",
-        "# Displaying the installation button\n",
-        "display(install_button)"
+        "Last updated: 2025-03-25"
       ]
     },
     {
@@ -58,15 +34,51 @@
       },
       "outputs": [],
       "source": [
-        "# @title Introduir DOI (doi:10.34810/dataXXX), el token i el nom complet de la institució. Clicar botó d'executar cel·la &#x25B6;\n",
+        "# @title First, enter your API token (if you don't have one, get it here: <a href='https://dataverse.csuc.cat/dataverseuser.xhtml?selectTab=apiTokenTab' target='_blank'>Get API Token</a>). Then enter only the final digits of the DOI (for example, if the DOI ends in <strong>dataXYZ</strong>, enter just <strong>XYZ</strong>). Finally, click the &#x25B6; button to run the script.\n",
+        "import os\n",
+        "import subprocess\n",
+        "import sys\n",
+        "\n",
+        "# Function to install required packages\n",
+        "def install_packages():\n",
+        "    \"\"\"\n",
+        "    Function to install or update necessary Python packages.\n",
+        "    \"\"\"\n",
+        "    # Upgrade pip first\n",
+        "    subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"--upgrade\", \"pip\", \"-q\"])\n",
+        "\n",
+        "    # Install the required libraries\n",
+        "    subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"pyDataverse\", \"-q\"])\n",
+        "    subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"selenium\", \"-q\"])\n",
+        "    subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"pyDataverse\", \"-q\"])\n",
+        "    subprocess.check_call([sys.executable, \"-m\", \"pip\", \"install\", \"--upgrade\", \"tensorflow-probability\", \"-q\"])\n",
+        "\n",
+        "\n",
+        "    print(\"Libraries have been downloaded or updated.\")\n",
+        "\n",
+        "# Install libraries if they are not installed already\n",
+        "try:\n",
+        "    import pyDataverse\n",
+        "except ImportError:\n",
+        "    print(\"Installing libraries...\")\n",
+        "    install_packages()\n",
+        "\n",
+        "try:\n",
+        "    from google.colab import output\n",
+        "    IN_COLAB = True\n",
+        "except ImportError:\n",
+        "    IN_COLAB = False\n",
+        "    output = None  # google.colab is only available inside Colab\n",
+        "\n",
+        "import ipywidgets as widgets\n",
+        "from IPython.display import display, HTML, clear_output\n",
+        "\n",
         "from datetime import date\n",
         "from pyDataverse.api import NativeApi, DataAccessApi\n",
         "from selenium import webdriver\n",
         "from selenium.webdriver.common.keys import Keys\n",
         "from selenium.webdriver.common.by import By\n",
-        "import sys\n",
         "import numpy as np\n",
-        "import os\n",
         "from collections import Counter\n",
         "import textwrap\n",
         "import pprint\n",
@@ -75,7 +87,7 @@
         "identifier = \"\"  # @param {type:\"string\"}\n",
         "token = \"\"  # @param {type:\"string\"}\n",
         "driver = None ## triar (webdriver.Chrome(), webdriver.Firefox() or None) per evaluar el daset a F-uji. Trieu None si useu l'script a Colab.\n",
-        "doi = identifier\n",
+        "doi = 'doi:10.34810/data'+identifier\n",
         "\n",
         "#Choose an institution\n",
         "institucions = [\n",
@@ -352,7 +364,8 @@
   ],
   "metadata": {
     "colab": {
-      "provenance": []
+      "provenance": [],
+      "include_colab_link": true
     },
     "kernelspec": {
       "display_name": "Python 3 (ipykernel)",
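
With `doi = 'doi:10.34810/data'+identifier`, a user who pastes the full DOI, as the previous version of the form asked, ends up with a doubled prefix. A minimal sketch of a more forgiving conversion follows. `build_doi` is a hypothetical helper, not part of the notebook, and it assumes the part after `data` is always numeric, as the form's instructions suggest.

```python
DOI_PREFIX = "doi:10.34810/data"


def build_doi(identifier):
    """Return a full DOI from either the trailing digits or an already complete DOI."""
    identifier = identifier.strip()
    if identifier.startswith(DOI_PREFIX):
        return identifier  # the user pasted the whole DOI; keep it unchanged
    if not identifier.isdigit():
        # Assumption: dataset suffixes are purely numeric.
        raise ValueError(f"Expected the trailing digits of the DOI, got {identifier!r}")
    return DOI_PREFIX + identifier


if __name__ == "__main__":
    print(build_doi("123456"))
    print(build_doi("doi:10.34810/data123456"))
```

Raising early on unexpected input gives a clearer message than the API error that would otherwise come back from `get_dataset`.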
 
@@ -1,5 +1,6 @@
 [![ca](https://img.shields.io/badge/lang-ca-blue.svg)](https://github.com/CSUC/RDR-scripts/blob/main/change_CSV_delimiter/README.md)
 [![en](https://img.shields.io/badge/lang-en-green.svg)](https://github.com/CSUC/RDR-scripts/blob/main/change_CSV_delimiter/README_ENG.md)
+[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/CSUC/RDR-scripts/blob/main/change_CSV_delimiter/csv_delimiter_converter.ipynb)
 
 # Convertidor de Delimitador CSV