
Lantern - Development

Local development environment

Requirements:

Setup:

  1. install tools (brew install git uv pre-commit 1password-cli)
  2. clone and setup project [2]
  3. configure app [3]
  4. generate an .env file

[2]

% git clone https://gitlab.data.bas.ac.uk/MAGIC/lantern-exp.git
% cd lantern-exp/
% pre-commit install
% uv sync --all-groups
% uv run playwright install

Local development publishing

To run BAS Data Catalogue Publishing Workflows locally:

  1. manually create a named AWS IAM user (lantern-$USER e.g. lantern-conwat) and add to the lantern-local-dev group to manage content in AWS S3 Content Buckets
  2. as a GitLab administrator impersonating the GitLab bot user, create a Personal Access Token 🔒:
    • token name: lantern-conwat
    • scopes: api
  3. for Trusted Publishing:
    • ensure you can access the catalogue directory within the Ops Data Store web root
    • if needed, create an SSH key and add the public key to the relevant remote authorised keys file
    • if needed, configure SSH to access the relevant server with the relevant credentials automatically [1]
    • if needed, set the umask for this user to 0002 to allow group write permissions on POSIX filesystems
  4. set the relevant Config options in your local .env file

[1] E.g. in ~/.ssh/config:

Host lantern-trusted-content
    HostName server.example.com
    User conwat
    IdentityFile ~/.ssh/lantern_trusted_content.pub

Needed for rsync to automatically authenticate.

Tip

The IdentityFile is deliberately a public key, used as a hint for the 1Password SSH agent.

Local development stack

Local versions of services used by Lantern can be run via containers.

Requirements:

Setup:

  • install tools (brew install --cask orbstack, brew install opentofu)

Start:

  • start stack using the stack-start Development Task
  • Run GitLab manual pre-configuration
  • Run local Infrastructure as Code using OpenTofu [2]
  • Run GitLab manual post-configuration

Stop:

Reset:

% task stack-stop
% rm -rf ./resources/dev/gitlab/data/
% task stack-start

[2]

% cd ./resources/dev
% tofu init
% GITLAB_TOKEN=xxx tofu apply

Local development web server

Part of Local Development Stack for the BAS Data Catalogue.

For simulating secure content hosting for Trusted Publishing.

Set relevant Config Options in .env file.

Once the Local Stack is up, visit web.dev.orb.local/cat/stage/items.

Local development load balancer

Part of Local Development Stack for the BAS Data Catalogue.

For simulating reverse proxying similar to the BAS HAProxy Load Balancer.

To externally test HAProxy request handling:

Tip

Add an x-use-aws header (with any or no value) to test public hosting using the non-local AWS testing environment.

To internally test backends within HAProxy using the OrbStack terminal:

# 'lantern_public_exported' backend: Public static hosting (via `serve` dev task on host)
> curl http://host.docker.internal:9000/static/txt/heartbeat.txt

# 'lantern_public_published' backend: Public static hosting (via AWS testing environment)
> curl -H "x-use-aws: true" http://host.docker.internal:9000/static/txt/heartbeat.txt

# 'lantern_secure' backend: Secure hosting (via `web` service in Local Stack)
> curl -I -u user:password https://haproxy.dev.orb.local/-/items/

# 'dms_fallback' backend: (DMS/non-proxied error)
> curl http://localhost/x

To view HAProxy internal stats:

Local development GitLab

Part of Local Development Stack.

Manual pre-configuration:

Manual post-configuration:

Tip

The truststore dev dependency is used to trust the OrbStack local CA in the GitLab Store.

[1] Root password:

% docker compose -f ./resources/dev/docker-compose.yml exec -it gitlab grep 'Password:' /etc/gitlab/initial_root_password

Development tasks

Taskipy is used to define development tasks, such as running tests and rebuilding site styles. These tasks are akin to NPM scripts or similar concepts.

Run task --list (or uv run task --list) for available commands.

Run task [task] (uv run task [task]) to run a specific task.

See Adding development tasks for how to add new tasks.

Tip

If offline, use uv run --offline task ... to avoid lookup errors when resolving the unconstrained build system requirements in pyproject.toml, which is a Known Issue with uv.

Development tasks config

The tasks._record_utils.ExtraConfig class extends the main lantern.Config class with extra config options used by development tasks.

Note

These extra variables are prefixed with X_ rather than LANTERN_.

| Option                                 | Type         | Sensitive | Since Version | Summary                                                                    | Default | Example                                            |
|----------------------------------------|--------------|-----------|---------------|----------------------------------------------------------------------------|---------|----------------------------------------------------|
| X_ADMIN_METADATA_SIGNING_KEY_PRIVATE   | JSON Web Key | Yes       | v0.4.x        | JSON Web Key (JWK) for updating administrative metadata                    | None    | '{"kid": "magic_metadata_signing_key", ...}'       |
| X_AGOL_CLIENT_ID                       | String       | No        | v0.6.x        | Client ID for AGOL OAuth application for accessing and updating items      | None    | 'xxx'                                              |
| X_AGOL_CLIENT_SECRET                   | String       | Yes       | v0.6.x        | Client secret for AGOL OAuth application for accessing and updating items  | None    | 'xxx'                                              |

Warning

These config options are not validated. Manually ensure any options needed by a task are set.
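Because these options are not validated centrally, each task must check the options it depends on. A minimal, hypothetical sketch of that pattern (the function name and error handling are illustrative, not the real ExtraConfig API):

```python
import os

# Hypothetical sketch: checking an unvalidated 'X_'-prefixed option inside a
# development task; names here are illustrative, not the real ExtraConfig API.
def require_agol_client_id() -> str:
    value = os.environ.get("X_AGOL_CLIENT_ID")
    if not value:
        # options are not validated centrally, so each task checks what it needs
        raise RuntimeError("X_AGOL_CLIENT_ID must be set to run this task")
    return value
```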

Record upgrade tasks

Warning

This section is Work in Progress (WIP) and may not be complete/accurate.

A mini-framework is available for bulk updating records. It consists of:

  1. a RecordUpgrade class which modifies a record as needed and tracks changes
  2. a RecordsReport class for compiling changes made to a set of records into a Markdown formatted report
  3. a RecordsIO class for reading and writing records, initially from a Store, then to a local directory for processing
  4. an Upgradamatron class for coordinating an overall upgrade, inc. initialising, upgrading and reporting

These classes are wrapped in a development task configured with a relevant local directory and name (YYYY-MM).

Each upgrade SHOULD use a separate development task with copies of these classes, modified as needed. Only the most recent development task SHOULD be kept but all MUST be checked in for future reference.

Upgrade methods should be largely atomic, focusing on one logical change. Changes MUST be tracked for inclusion in the upgrade report. Non-changes SHOULD be tracked via the logger at debug level for troubleshooting.
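As a sketch of an atomic upgrade method with change tracking, assuming a RecordUpgrade-like class holding a record and a list of change messages (the real class and record model differ; names here are illustrative only):

```python
import logging

logger = logging.getLogger("app")

# Simplified stand-in for a RecordUpgrade-style class (illustrative only).
class RecordUpgradeSketch:
    def __init__(self, record: dict) -> None:
        self.record = record
        self.changes: list[str] = []

    def upgrade_edition(self) -> None:
        """One logical change: ensure an edition is set."""
        if self.record.get("edition"):
            # non-changes tracked at debug level for troubleshooting
            logger.debug("edition already set, no change")
            return
        self.record["edition"] = "1"
        self.changes.append("set missing edition to '1'")  # tracked for report
```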

Tip

Typically only the RecordUpgrade class will need updating.

See Usage documentation for more information on running an upgrade.

Previous upgrades for reference:

Contributing

All changes except minor tweaks (typos, comments, etc.) MUST:

  • be associated with an issue (either directly or by reference)
  • be included in the Change Log

Conventions

  • all deployable code should be contained in the lantern package
  • use Path.resolve() if displaying or logging file/directory paths
  • use logging to record how actions progress, using the app logger (logger = logging.getLogger('app'))
  • extensions to third party dependencies should be:
    • created in lantern.lib
    • documented in Libraries
    • tested in tests.lib_tests/
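Two of these conventions (resolved paths and the app logger) can be combined as in this illustrative helper (the function itself is hypothetical):

```python
import logging
from pathlib import Path

logger = logging.getLogger("app")  # the app logger per the conventions above

# Illustrative helper: resolve paths before display and record progress via
# the app logger (the function name is hypothetical).
def report_output_path(path: Path) -> Path:
    resolved = path.resolve()
    logger.info("Writing output to %s", resolved)
    return resolved
```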

Adding configuration options

In the lantern.Config class:

  • define a new property with a relevant validation method if needed
  • add property to ConfigDumpSafe typed dict
  • add property to dumps_safe() method
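The pattern above can be sketched with a simplified stand-in for the real lantern.Config class (NEW_OPTION is a hypothetical option name; the real class's validation and dump methods differ):

```python
import os

# Simplified stand-in for the real lantern.Config class, showing the pattern
# above: a property, an optional validation method and a safe-dump entry.
class ConfigSketch:
    @property
    def NEW_OPTION(self) -> str:  # hypothetical option name
        return os.environ.get("LANTERN_NEW_OPTION", "default")

    def _validate_new_option(self) -> None:
        if not self.NEW_OPTION:
            raise ValueError("LANTERN_NEW_OPTION must not be empty")

    def dumps_safe(self) -> dict:
        # sensitive options would be redacted rather than dumped verbatim
        return {"NEW_OPTION": self.NEW_OPTION}
```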

In the Configuration documentation:

  • add to the Options Table (in alphabetical order)
  • if needed, add a subsection to explain the option in more detail

If configurable:

  • update the .env.tpl template and any existing .env files
  • update the [tool.pytest_env] section in pyproject.toml

In the tests.lantern_tests.config module:

  • update the expected response in the test_dumps_safe method
  • if validated, update and/or add test_validate_ tests as needed
  • if configurable, update the test_configurable_property method
  • update or create other tests as needed

Adding catalogue item types

Warning

This section is Work in Progress (WIP) and may not be complete/accurate.

Agree the use of new types:

  1. if types are not members of the ISO 19115 MD_ScopeCode code list, create and agree a proposal to add locally in the BAS Metadata Standards 🛡️ project
  2. revise requirement 03 in the MAGIC Discovery Profile and map to a Super Type via a proposal in the MAGIC Data Management 🛡️ project

Update record schemas in the BAS Metadata Library to allow the new types in records:

  1. if needed, add types to the hierarchy_level enum in the ISO 19115 JSON Schema
  2. add the types to the relevant super type enum in the MAGIC Discovery Profile JSON Schema
  3. release a new version of the library and update the dependency in this project

Within this project, for each new item type:

  • if a new local level, update the lantern.lib.metadata_library.models.record.enums.HierarchyLevelCode enum
  • update the lantern.models.item.base.enums.ResourceTypeLabel enum to set a formatted value/label
  • update the lantern.models.item.catalogue.enums.ResourceTypeIcon enum to set an accompanying icon
  • if the new type is a 'container' Super Type:
    • add the HierarchyLevelCode member to the lantern.models.item.catalogue.const.CONTAINER_SUPER_TYPES list
    • update tests.lantern_tests.models.item.catalogue.test_item_catalogue.TestItemCatalogue.test_super_type
  • if the new type is included in citations:
    • update the lantern.lib.metadata_library.models.record.presets.citation.CitationHierarchyLevelCode enum
    • update tests.lib_tests.metadata_library.models.record.presets.test_citation.TestMakeMagicCitation.test_citation
  • if the new type introduces a new Item Alias Prefix:
    • update the prefixes mapping in lantern.models.record.record.Record._validate_aliases() to set allowed aliases
    • update the allowed prefixes table in the Record requirements docs
    • add new paths for prefixes in the OpenAPI Definition
    • update static site endpoints in Reverse Proxying and resources/dev/haproxy to include new prefixes, and request updating the BAS Load Balancer config to match
  • if the new type introduces new item relationships:
    • add relevant properties to lantern.models.item.catalogue.elements.Aggregations
    • call new properties in lantern.resources.templates._macros.related
    • add tests as needed in:
      • tests.lantern_tests.models.item.catalogue.test_elements.TestAggregations
      • tests.lantern_tests.templates.macros.test_tabs.TestRelatedTab
  • add a new Test Record using the new item type, aliases and/or relationships as applicable
  • verify new behaviour in a local site build

Adding properties to items

Warning

This section is Work in Progress (WIP) and may not be complete/accurate.

  1. if needed, Support New Record Properties
  2. if needed, update Item classes to process new and/or existing properties
    • existing properties may need updating such as ItemBase.kv handling
  3. add new properties to the relevant item tab class in lantern.models.item.catalogue.tabs
    • work backwards to include additional Record properties in the main lantern.models.item.catalogue class
    • and/or lantern.models.item.catalogue.elements classes
    • amend tests that directly instantiate these classes to include the new property
      • some of these are not obvious where kwargs are used to pass properties (e.g. lantern.models.item.catalogue.special.physical_map.AdditionalInfoTab)
  4. update the Site Template to include the new property as needed
  5. add tests as needed for:
    • Record properties
    • Item properties
    • Item Catalogue tab, element and base classes
    • Item templates (static HTML tests and Playwright if needed)
  6. if needed, update lantern.models.checks.RecordChecks to generate checks for the new property
  7. update any relevant record authoring guides to explain how new properties are handled by the Catalogue
  8. if a property is required for all items:
    • update the Record Requirements documentation
    • in future this may include updating a corresponding JSON Schema too
  9. amend list of unsupported properties in /docs/models.md#catalogue-item-limitations as needed

Adding distribution formats

  1. if needed, register new media-types under the Metadata Standards resources site (metadata-resources.data.bas.ac.uk)
  2. create a new class under lantern.models.item.catalogue.distributions, inheriting from Distribution or a relevant subclass and tests under tests.lantern_tests.models.item.catalogue.test_distributions
  3. configure the new distribution format class:
    • set the matches class method to determine an exclusive match for the distribution (typically via media types)
    • add an item to the lantern.models.item.catalogue.enums.DistributionType enum for the distribution type
  4. add the new class to the lantern.models.item.catalogue.tabs.DataTab._supported_distributions list
  5. if the distribution should use a collapsible information panel, edit the src/lantern/resources/templates/_macros/_tabs/data.html.j2 macros in the Site Templates:
    • create a new macro for the distribution format
    • update the panel macro to call the new macro
    • update the tests.lantern_tests.templates.macros.test_tabs.TestDataTab.test_data_info tests
  6. include the new distribution format in Test Records:
    • tests.resources.records/item_cat_data::record
    • tests.resources.records/item_cat_checks::record
  7. include the distribution format in the lantern.models.checks.DistributionChecks class and add tests
  8. if needed, add a new enum member for the check type in lantern.models.checks.CheckType
  9. if needed, add check logic to lantern.checks.CheckRunner and add tests
  10. update the Item distribution options docs
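The matches class method from step 3 can be sketched as follows, using a simplified stand-in rather than the real Distribution base class (the class name is hypothetical; application/geopackage+sqlite3 is a registered media type used here as an example):

```python
# Illustrative stand-in for a Distribution subclass (not the real base class).
class GeoPackageDistributionSketch:
    MEDIA_TYPE = "application/geopackage+sqlite3"  # example registered media type

    @classmethod
    def matches(cls, option: dict) -> bool:
        """Return True only for distribution options this class handles exclusively."""
        return option.get("media_type") == cls.MEDIA_TYPE
```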

Adding catalogue item tabs

Warning

This section is Work in Progress (WIP) and may not be complete/accurate.

  1. create a new class in lantern.models.item.catalogue.tabs inheriting from lantern.models.item.catalogue.tabs.Tab
    • set the anchor, title, icon properties
    • add tab specific properties and logic
    • create catalogue-specific elements or element subclasses in lantern.models.item.catalogue.elements as needed
    • set logic for the enabled property based on relevant tab properties
  2. in lantern.models.item.catalogue.item.ItemCatalogue:
    • add a private property returning an instance of the tab class
    • call property in tabs property
    • update default_tab_anchor property if tab is optional and/or should be shown before additional information tab
  3. create a new macro in lantern.resources.templates._macros.tabs named after the tab anchor
    • ensure the section ID attribute is set correctly for tab navigation to work
    • populate tab macro as needed
    • create additional macros nearby (if one or two), or under lantern.resources.templates._macros._tabs
  4. run the tailwind Development Task to update styles (for tab classes to work)
  5. update lantern_tests.models.item.catalogue.test_tabs to cover new tab class
  6. update lantern_tests.models.item.catalogue.test_item_catalogue.TestItemCatalogue.test_tabs to include new tab class
  7. update TestItemCatalogue.test_default_tab_anchor if tab should be shown before additional information tab
  8. update tests within lantern_tests.templates as needed
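The tab class from step 1 can be sketched with a simplified stand-in for the real lantern.models.item.catalogue.tabs.Tab base class (class name, icon value and constructor are illustrative assumptions):

```python
# Minimal illustrative stand-in for a tab class (not the real Tab base class).
class ExampleTabSketch:
    anchor = "example"      # used as the section ID for tab navigation
    title = "Example"
    icon = "far fa-circle"  # hypothetical icon class

    def __init__(self, items: list) -> None:
        self._items = items

    @property
    def enabled(self) -> bool:
        """Only show the tab when it has content."""
        return len(self._items) > 0
```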

Adding catalogue licences

  1. update lantern.models.item.catalogue.enums.Licence
  2. in src/lantern/resources/templates/_macros/_tabs/licence.html.j2:
    • create a new macro calling the licence macro, named after a lower case version of the Licence enum item
  3. create a new test record in tests.resources.records.item_cat_licence
  4. add test record to:
    • tests.resources.records.item_cat_collection_all
    • tests.resources.stores.fake_records_store.FakeRecordsStore._fake_records
  5. update the lantern_tests.templates.macros.test_tabs.TestLicenceTab.test_licence test

Adding site pages

Warning

This section is Work in Progress (WIP) and may not be complete/accurate.

  • ... include in lantern.verification.Verification.site_pages list
  • ... include in OpenAPI Definition

Updating styles

Important

Follow the Styling Guidelines when updating styles.

  1. make changes to src/lantern/resources/templates/_assets/css/main.css.j2
  2. apply classes as necessary to elements in HTML Templates
  3. run the css Development Task which will:
    • build a temporary Static Site using the Test Catalogue
    • run the Tailwind compiler against this site output, adding or removing classes based on usage
    • copy the resulting minified CSS to src/lantern/resources/css/main.css
  4. run the build-test-records or build-records Development Task to rebuild the static site
    • needed as builds reference a local copy of main.css that will need refreshing

Tip

You can run uv run task css && uv run task build-test-records to chain these tasks together when iterating changes.

Updating scripts

  1. make changes to src/lantern/resources/templates/_assets/js/*.js.j2 and/or Asset Macros
  2. if needed, make changes to HTML Templates and/or Common Macros
  3. run the build-test-records or build-records Development Task to rebuild the static site
    • needed to include variables from Site Metadata
    • needed as builds reference local copies of these dynamic generated scripts

Tip

You can run task js && task build-test-records to chain these tasks together when iterating changes.

Adding development tasks

See the Taskipy documentation.

Python version

The minimum Python version is 3.11 for consistency with related projects.

Dependencies

Vulnerability scanning

The uv audit command checks for known vulnerabilities in packages using the Python Packaging Advisory Database.

Warning

As with all security tools, uv audit is a tool for detecting well-known vulnerabilities, not a guarantee of secure dependencies. While uv audit is experimental, its data source is not.

Checks are run automatically in Continuous Integration.

Tip

To check locally run the vulnerabilities Development Task.

Updating dependencies

Tip

If playwright is upgraded, run uv run playwright install locally and update the CI image to match the new version.

To list all (direct and indirect) outdated dependencies, run uv tree --outdated.

Linting

Ty

Ty is used for static type checking in main application Python files (not tests, etc.). Default options are used. Type checks are run automatically in Continuous Integration and the Pre-Commit Hook.

Important

Ty is an experimental tool and includes many false positive findings.

As a pragmatic (if lax) approach, known false-positive findings are frequently ignored where working around them would require disproportionate effort.

A stricter approach may be adopted as ty matures, and/or specific bugs are fixed.

Tip

To check types manually run the types Development Task.

Ruff

Ruff is used to lint and format Python files. Specific checks and config options are set in pyproject.toml. Linting checks are run automatically in Continuous Integration and the Pre-Commit Hook.

Tip

To check linting manually run the lint Development Task, for formatting run the format task.

Static security analysis

Ruff is configured to run Bandit, a static analysis tool for Python.

Warning

As with all security tools, Bandit is an aid for spotting common mistakes, not a guarantee of secure code. In particular this tool can't check for issues that are only detectable when running code.

Markdown

PyMarkdown is used to lint Markdown files. Specific checks and config options are set in pyproject.toml. Linting checks are run automatically in Continuous Integration and the Pre-Commit Hook.

Tip

To check linting manually run the markdown Development Task.

Wide tables will fail rule MD013 (max line length). Wrap such tables with pragma disable/enable exceptions:

<!-- pyml disable md013 -->
| Header | Header |
|--------|--------|
| Value  | Value  |
<!-- pyml enable md013 -->

Stacked admonitions will fail rule MD028 (blank lines in blockquote) as it's ambiguous whether a new blockquote has started where another element isn't in between. Wrap such instances with pragma disable/enable exceptions:

<!-- pyml disable md028 -->
> [!NOTE]
> ...

> [!NOTE]
> ...
<!-- pyml enable md028 -->

Editorconfig

For consistency, it's strongly recommended to configure your IDE or other editor to use the EditorConfig settings defined in .editorconfig.

Pre-commit hook

A Pre-Commit hook is configured in .pre-commit-config.yaml.

To update Pre-Commit and configured hooks:

% pre-commit autoupdate

Tip

To run pre-commit checks against all files manually run the pre-commit Development Task.

Testing

Pytest

pytest with a number of plugins is used for testing the application. Config options are set in pyproject.toml. Tests are defined in the tests package.

Note

Parallel processing within code is disabled in tests (by setting the PARALLEL_JOBS config option) to avoid issues with HTTP recording.

Tests themselves are run in parallel using pytest-xdist.

Tests are run automatically in Continuous Integration.

Tip

To run tests manually run the test Development Task.

Tip

To run a specific test:

% uv run pytest tests/path/to/test_module.py::<class>::<method>

Pytest fast fail

If a test run fails with a NotImplementedError exception run the test-reset Development Task.

This occurs where:

  • a test fails and the failed test is then renamed or parameterised options changed
  • the reference to the previously failed test has been cached to enable the --failed-first runtime option
  • the cached reference no longer exists triggering an error which isn't handled by the pytest-random-order plugin

Running this task clears Pytest's cache and re-runs all tests, skipping the --failed-first option.

Pytest fixtures

Fixtures SHOULD be defined in tests.conftest prefixed with fx_ to indicate they are a fixture when used in tests. E.g.:

import pytest

@pytest.fixture()
def fx_foo() -> str:
    """Example test fixture."""
    return 'foo'

Pytest-cov test coverage

pytest-cov checks test coverage. We aim for 100% coverage but exemptions are fine with good justification:

  • # pragma: no cover - for general exemptions
  • # pragma: no branch - where a conditional branch can never be called
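An illustrative placement of these pragmas (the function and justifications are hypothetical; real exemptions need a genuine justification):

```python
import sys

# Example placement of the coverage pragmas above (illustrative only).
def platform_label() -> str:
    if sys.platform.startswith("linux"):  # pragma: no branch
        return "linux"
    return "other"  # pragma: no cover - unreachable on Linux-only CI runners
```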

Continuous Integration will check coverage automatically.

Tip

To check coverage manually run the test-cov Development Task.

Tip

To run tests for a specific module locally:

% uv run pytest --cov=lantern.some.module --cov-report=html tests/lantern_tests/some/module

Where tests are added to ensure coverage, use the cov mark, e.g.:

import pytest

@pytest.mark.cov()
def test_foo():
    assert 'foo' == 'foo'

Pytest-xdist

pytest-xdist runs tests in parallel to speed up test runs.

Note

Tests that rely on the fx_exporter_static_server fixture (such as Playwright tests) need to run on the same worker. This is set using the e2e xdist group (i.e. @pytest.mark.xdist_group("e2e")).

Pytest-env

pytest-env sets environment variables used by the Config class to fake values when testing. Values are configured in the [tool.pytest_env] section of pyproject.toml.

Pytest-recording

pytest-recording is used to mock HTTP calls to provider APIs (ensuring known values are used in tests).

Caution

Review recorded responses to check for any sensitive information.

To update a specific test:

% uv run pytest -n 0 --record-mode=once tests/path/to/test_module.py::<class>::<method>
# E.g.
% uv run pytest -n 0 --record-mode=once tests/lantern_tests/stores/test_gitlab_store.py::TestGitLabLocalCache::test_fetch_file_commits

To incrementally build up a set of related tests (including parameterised tests) use the new_episodes recording mode:

% uv run pytest -n 0 --record-mode=new_episodes tests/path/to/test_module.py::<class>::<method>
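Once recorded, tests typically opt in to replaying cassettes via the vcr mark provided by pytest-recording (the test name and body here are illustrative):

```python
import pytest

# pytest-recording's 'vcr' mark replays the recorded cassette for HTTP calls
# made in the test (test body omitted; cassette assumed to exist).
@pytest.mark.vcr
def test_fetch_file_commits():
    ...
```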

Static site template tests

Pytest parameterised tests with BeautifulSoup are used to check expected content is returned for each variant of Static Site Templates, e.g. with and without an optional property.
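The pattern looks roughly like this, with template rendering stubbed as literal HTML (the real tests render Site Templates; the HTML values and selector below are illustrative stand-ins):

```python
import pytest
from bs4 import BeautifulSoup

# Sketch of a parameterised template test: parse a rendered variant and assert
# on its content. HTML values here stand in for real template output.
@pytest.mark.parametrize(
    ("html", "expected"),
    [
        ("<main><h1>Title</h1></main>", "Title"),
        ("<main><h1>Other</h1></main>", "Other"),
    ],
)
def test_title_shown(html: str, expected: str) -> None:
    soup = BeautifulSoup(html, "html.parser")
    assert soup.select_one("main h1").text == expected
```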

Playwright tests

Playwright Python tests are used to verify the behaviour of dynamic JavaScript content, such as switching tabs in items and opening/closing the feedback widget.

To run a specific test file with visible output:

% uv run pytest --headed tests/lantern_tests/e2e/test_item_e2e.py

Playwright tests require a real website to test against, which is provided by the fx_exporter_static_server fixture. This hosts a local Static Site served from a temporary directory using Python's simple HTTP server. The site is built by the fx_exporter_static_site fixture and contains all Test Records.

Note

This local server cannot be used directly in CI. Instead, a Python simple server serving a known (initially empty) path in the build directory is started before Pytest runs. The fx_exporter_static_server fixture detects the CI environment and copies the static site build to this path, then quits, giving an equivalent outcome.

Note

Make sure Playwright tests use the e2e xdist group (i.e. @pytest.mark.xdist_group("e2e")) to avoid test failures.

Test catalogue

tests.resources.catalogues.fake_catalogue.FakeCatalogue

To aid in debugging and testing, a simple Catalogue is provided using a Test Records Store, exporting to a local directory.

Test records

tests.resources.records

To aid in debugging and testing, a set of fake records are included for:

  • example collections and products with only minimal properties set
  • example collections and products with all optional properties set
  • example items to test supported formatting options in free-text properties
  • example items to test supported distribution options and verification types
  • example items for each supported licence
  • examples of special items, such as physical maps

These records are used within tests but CAN and SHOULD also be used when developing Templates.

Test records store

tests.resources.stores.fake_records_store.FakeRecordsStore

An in-memory Store is provided to load these records for use with Exporters.

Test records signing keys

Test keys from the BAS Metadata Library are used for signing and encrypting Administrative Metadata within test records.

An additional X_ADMIN_METADATA_SIGNING_KEY_PRIVATE environment variable is set to load the private signing key for signing admin metadata instances.

Tip

In Development Tasks, the tasks._config.ExtraConfig class includes an ADMIN_METADATA_KEYS_RW property returning an AdministrationKeys instance with this key loaded.

Adding new test records

Warning

This section is Work in Progress (WIP) and may not be complete/accurate.

  1. create new tests/resources/records/item_cat_*.py file or clone from minimal examples
    • records MUST use a unique file_identifier
    • the tests.resources.records.utils.make_record() method SHOULD be used as a base (properties can be unset later)
  2. include the record in the tests.resources.records.item_cat_collection_all.collection_members list
  3. include the record in the resources.stores.fake_records_store.FakeRecordsStore._fake_records() method

If adding a New Item Type, you SHOULD create a minimum (_min) and full (_all) test record.

Tip

Run the build-test-records Development Task to export a static site using these records.

Run the serve task to host an exported static site, with real or test records, locally.

Test stores

Test GitLab local cache

To aid in debugging and testing a GitLab local cache, a backing SQLite database representing a minimally populated cache is available from tests/resources/stores/gitlab_cache/cache.db.

This database is populated independently of the GitLabLocalCache's implementation, but uses an aligned structure. It contains a single record, with a simplified file identifier and Git commit ID.

Tip

Run the build-test-cache Development Task to recreate the test cache database.

Continuous Integration

All commits will trigger Continuous Integration using GitLab's CI/CD platform, configured in .gitlab-ci.yml.