context.md

This file provides guidance to AI assistants when working with code in this repository.

Project Overview

This is an Ansible role for deploying Dataverse, a research data repository platform. The role installs and configures all required components: Apache httpd (reverse proxy), Payara/GlassFish (Java EE application server), PostgreSQL (database), and Solr (search/indexing).

Important: The Dataverse installer is not idempotent. Running the playbook multiple times on the same host will fail. For testing iterations, fully destroy and recreate the environment.

Development Environment Setup

This project uses uv for Python dependency management, and Molecule with Docker for testing.

Initial Setup

# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Set up development environment (installs Python 3.11, creates venv, installs deps)
uv sync

# Install Ansible collections (vendored locally)
make bootstrap

The make bootstrap command installs vendored collections from collections/requirements.yml into ./collections. This project vendors specific versions of community.general (11.2.1) and community.postgresql (4.1.0).
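
To confirm which versions actually landed in ./collections, the installed metadata can be read back. The sketch below assumes the standard ansible-galaxy layout (ansible_collections/<namespace>/<name>/MANIFEST.json under the install path). The helper name vendored_version is illustrative and not part of this repository.

```shell
# Print the version recorded in a vendored collection's MANIFEST.json.
# Assumes the standard ansible-galaxy layout under the given collections root.
# Usage: vendored_version <collections_root> <namespace> <name>
vendored_version() {
  manifest="$1/ansible_collections/$2/$3/MANIFEST.json"
  [ -f "$manifest" ] || { echo "not installed: $2.$3" >&2; return 1; }
  # Take the first "version" key, which belongs to collection_info.
  sed -n 's/.*"version": *"\([^"]*\)".*/\1/p' "$manifest" | head -n 1
}
```

Run from the repository root after make bootstrap, `vendored_version ./collections community general` should report 11.2.1 if the pin described above is in effect.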

Key files:

  • pyproject.toml - Dependency specifications
  • uv.lock - Locked dependency versions (commit this!)
  • .python-version - Python version (3.11)
  • .venv/ - Virtual environment (git-ignored)

Testing with Molecule

Run Full Test Cycle

uv run molecule test -s rocky9

Iterative Development

# Start/create the container and run the playbook
uv run molecule converge -s rocky9

# Open a shell in the running container
uv run molecule login -s rocky9

# Destroy the container (required between test runs due to non-idempotency)
uv run molecule destroy -s rocky9
# or
uv run molecule reset -s rocky9

# Alternatively, activate the venv and run commands directly
source .venv/bin/activate
molecule converge -s rocky9

Dataverse will be accessible at http://localhost:8080 after a successful converge.

Default credentials:

  • Username: dataverseAdmin
  • Password: see dataverse_adminpass in tests/group_vars/vagrant.yml

Molecule Configuration

The rocky9 scenario (molecule/rocky9/) uses:

  • Driver: Docker with a systemd-enabled Rocky Linux 9 image
  • Inventory: Links to group_vars/ for configuration
  • Playbooks: prepare.yml (setup), converge.yml (main role execution)
  • Port mapping: 8080:8080 for Dataverse API/UI access

Running the Role

Local Development with Vagrant

vagrant up

Access Dataverse at http://localhost:8880. If that port does not respond, check the Vagrantfile for the actual port mappings.

Direct Ansible Playbook Execution

Main playbook: dataverse.pb

# Basic run
ansible-playbook -i <inventory> -e "@defaults/main.yml" dataverse.pb

# With specific tags
ansible-playbook -i <inventory> -e "@defaults/main.yml" dataverse.pb --tags "apache,postgres"

# Local connection mode (for single-host deployments)
ansible-playbook --connection=local -i inventory dataverse.pb -e "@defaults/main.yml"

Architecture

Role Structure

  • tasks/main.yml: Orchestrates all task imports in dependency order
  • tasks/: Individual task files for each component (apache, postgres, solr, payara, etc.)
  • defaults/main.yml: Default configuration variables (override via group_vars or -e)
  • templates/: Jinja2 templates for config files
  • handlers/: Service restart handlers
  • files/: Static files (branding, scripts, etc.)

Key Components and Services

Component            Location                  Service Control
Apache httpd         /etc/httpd/conf.d         systemctl {start|stop|restart} httpd
Payara (GlassFish)   /usr/local/payara5        systemctl {start|stop|restart} payara
PostgreSQL           /var/lib/pgsql/*/data/    systemctl {start|stop|restart} postgresql-*
Solr                 /usr/local/solr           systemctl {start|stop|restart} solr

Migration & Environment Documentation

For major version upgrades and multi-environment configuration, refer to these guides:

Configuration Architecture

Configuration hierarchy (highest precedence first):

  1. Extra vars passed with -e at runtime
  2. group_vars/ files (environment-specific)
  3. defaults/main.yml (role defaults)

Key configuration variables:

  • dataverse_branch: Git branch to deploy (default: release)
  • dataverse_repo: Source repository URL
  • apache.ssl.enabled: Enable HTTPS
  • db.postgres.enabled: Enable PostgreSQL installation
  • dataverse.adminpass: Admin user password
  • dataverse.payara.siteurl: Full public URL (critical for OAuth/OIDC)
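
As a rough illustration of the top level of the hierarchy, the sketch below prints (without running) a playbook command with one-off overrides appended after the role defaults. Ansible applies multiple -e options in order, so an override placed after "@defaults/main.yml" should take effect for the same variable. The function name playbook_cmd and the value develop are examples, not part of this repository.

```shell
# Print (not run) an ansible-playbook command that appends one-off overrides
# after the role defaults. Function name and example values are illustrative.
# Usage: playbook_cmd <inventory> [key=value ...]
playbook_cmd() {
  inventory="$1"
  shift
  cmd="ansible-playbook -i $inventory -e @defaults/main.yml"
  for override in "$@"; do
    cmd="$cmd -e $override"
  done
  printf '%s dataverse.pb\n' "$cmd"
}

playbook_cmd inventory dataverse_branch=develop
# prints: ansible-playbook -i inventory -e @defaults/main.yml -e dataverse_branch=develop dataverse.pb
```

Nested settings such as dataverse.adminpass are dictionary entries, so they are usually easier to override through a vars file passed with -e "@file" than through key=value pairs.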

Vendored Collections

This project vendors Ansible collections locally in ./collections to pin specific versions. The ansible.cfg sets collections_path = ./collections:~/.ansible/collections:/usr/share/ansible/collections.

IMPORTANT: When using modules from these collections, use Fully Qualified Collection Names (FQCN):

  • community.general.apache2_module (not just apache2_module)
  • community.postgresql.postgresql_db (not just postgresql_db)

If you encounter "module not found" errors, verify:

  1. Collections are installed: make bootstrap
  2. FQCNs are used in task files
  3. ansible.cfg collections_path is correct
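
The second check can be partly automated with grep. This is a minimal sketch that only looks for the two modules named above when they are called by short name. The function name find_short_module_names is illustrative, and the pattern assumes one module key per line in conventional YAML task syntax.

```shell
# List task lines that call the two vendored modules named above by short name.
# Usage: find_short_module_names <dir>   (exit status 1 means none were found)
find_short_module_names() {
  grep -rnE '^[[:space:]]*(-[[:space:]]+)?(apache2_module|postgresql_db):' "$1"
}
```

For example, `find_short_module_names tasks/` prints each offending file and line number. Extend the alternation in the pattern to cover other modules from the vendored collections.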

Ansible Tags

Use tags to run specific portions of the role:

# Install only Apache
ansible-playbook dataverse.pb --tags "apache"

# Install prerequisites and PostgreSQL
ansible-playbook dataverse.pb --tags "prereqs,postgres"

# Skip time-consuming tasks
ansible-playbook dataverse.pb --skip-tags "sampledata"

Common tags: prereqs, apache, postgres, solr, payara, dataverse, shibboleth, sampledata

Git Workflow (UCLA Fork)

Main branch: develop (not main)

Branch naming:

  • config/* - Configuration changes (group_vars, defaults)
  • task/* - Role task/template updates
  • doc/* - Documentation

Pull requests: Target develop branch

# Create feature branch
git checkout -b task/fix-payara-config

# Create PR via GitHub CLI
gh pr create --base develop --fill
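
The naming convention can be checked mechanically, for instance from a local Git hook. The function below is a hypothetical helper, not something this repository ships. It accepts only the three documented prefixes followed by a non-empty name.

```shell
# Succeed when a branch name uses one of the documented prefixes
# (config/, task/, doc/) followed by at least one character.
# Usage: valid_branch_name <branch>
valid_branch_name() {
  case "$1" in
    config/?* | task/?* | doc/?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: warn about the branch name used in the commands above.
valid_branch_name "task/fix-payara-config" || echo "unexpected branch prefix" >&2
```

To check the branch currently checked out, pass it the output of `git rev-parse --abbrev-ref HEAD`.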

Supported Platforms

  • RHEL/Rocky Linux 8, 9 (primary support)
  • Debian 11, 12 (supported)

The role auto-detects OS and includes appropriate task files (e.g., postgres_redhat.yml vs postgres_debian.yml).

Troubleshooting

Dataverse Installation Fails

  • Check /usr/local/payara5/glassfish/domains/domain1/logs/server.log
  • Verify PostgreSQL is running: systemctl status postgresql-*
  • Ensure Solr is accessible: curl http://localhost:8983/solr/
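
For the first check, the log is often large, so it can help to pull out only the most recent error-level entries. The sketch below assumes Payara tags those entries with the java.util.logging level name SEVERE. The function name recent_severe and the default count of 20 are illustrative.

```shell
# Show the last few SEVERE entries from a Payara server.log, with line numbers.
# Usage: recent_severe [logfile] [count]
recent_severe() {
  log="${1:-/usr/local/payara5/glassfish/domains/domain1/logs/server.log}"
  count="${2:-20}"
  grep -n 'SEVERE' "$log" | tail -n "$count"
}
```

Inside the container or VM, `recent_severe` with no arguments reads the default log path listed above. Pass a different path and count to inspect a copied log.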

Collections Not Found

make bootstrap  # Reinstall vendored collections

Port 8080 Already in Use (Molecule)

Edit molecule/rocky9/molecule.yml published_ports to use a different host port.
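
Before editing the mapping, it may be worth confirming that the port is really taken on the Docker host. This sketch assumes a Linux host with the ss utility from iproute2. The helper name port_listed is illustrative.

```shell
# Succeed if the given port appears as a local listening port in `ss -ltn`
# output supplied on stdin.
# Usage: ss -ltn | port_listed 8080
port_listed() {
  awk -v suffix=":$1" '
    NR > 1 && $4 ~ (suffix "$") { found = 1 }
    END { exit !found }
  '
}

# Only query the host when ss is available.
if command -v ss >/dev/null 2>&1; then
  if ss -ltn | port_listed 8080; then
    echo "port 8080 is already in use"
  else
    echo "port 8080 is free"
  fi
fi
```

The match is anchored on the colon and the end of the address field, so a listener on 18080 or 80 is not mistaken for 8080.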

Non-Idempotent Errors

The Dataverse installer cannot be run twice. Destroy and recreate:

uv run molecule destroy -s rocky9
uv run molecule converge -s rocky9

Dependency Management

Add a new dependency:

uv add package-name
# or with version
uv add "ansible-core==2.17.0"

Update dependencies:

uv lock --upgrade
uv sync

After pulling changes:

uv sync  # Syncs .venv with uv.lock