Skip to content

Codereview#4

Open
epifanio wants to merge 10 commits intodevfrom
codereview
Open

Codereview#4
epifanio wants to merge 10 commits intodevfrom
codereview

Conversation

@epifanio
Copy link
Copy Markdown
Collaborator

codereview: robustness, security fixes, and axis control UX

Security

  • js_util.py: Replace Javascript class (which used eval() to execute
    arbitrary JS) with Redirector, a safe ReactiveHTML component that uses
    window.location.assign() directly. No code is evaluated; the URL is
    reset to '' after the redirect fires to allow reuse.

Bug fixes

  • TSP/main.py – plottable-variable selection: Replace the fragile
    heuristic (coord-in-dims intersection) with get_plottable_vars() from
    utility.py. The old approach missed variables in DSG (Discrete Sampling
    Geometry) datasets where the observation dimension has no registered
    coordinate, and incorrectly included coordinate/metadata variables
    (latitude, longitude, bottomDepth, profileIdAll, QC flags, etc.).

  • TSP/main.py – dimension selector: Replace list(ds[result].indexes)
    with get_axis_candidates(), which scans named indexes, auxiliary
    1-D coordinates, and coordinate-like data variables (depth, pressure,
    etc.) while excluding spatial metadata (lat, lon, altitude, station,
    …) via a configurable AXIS_BLACKLIST. Previously the selector was
    populated with all data variables, making it unusable on real datasets.

  • TSP/main.py – dimension switching: Fix the non-timeseries branch of
    plot() where x was incorrectly set to var instead of dimension,
    causing hvplot to ignore the dimension selection entirely. Now always
    sets axis_arguments['x'] = dimension and removes any stale 'y' key.
    For DSG datasets where the dimension variable is not registered as a
    coordinate on the target variable, falls back to a Dataset-level
    hvplot.line(x=dimension, y=var) call using a minimal sub-dataset.

  • TSP/main.py – y-axis inversion: Guard ds[dimension].attrs lookup with
    if dimension in ds: to avoid KeyError when dimension is a raw
    dimension name (e.g. obs) with no associated variable.

  • TSP/main.py – on_var_select(): Fix crash where dimension_group.options
    was populated with list(ds[result].indexes) (result is a list, not a
    string key) instead of axis candidates for the selected variable. Rebuild
    axis candidates and reset dimension_group.value to the first option
    whenever the variable changes.

  • TSP/main.py – export resampling logic: Fix inverted boolean in
    export_selection() that caused the resampler flag to always be False
    even when a non-raw frequency was selected.

  • TSP/main.py – show/hide buttons: Fix show_hide_widget() so that
    opening the metadata panel closes the downloader and vice-versa, by
    accepting explicit hide= and reveal= arguments via functools.partial.

  • TSP/utility.py – validate_opendap(): Use a context manager
    (with xr.open_dataset(...) as ...) to avoid leaking file handles.

  • TSP/utility.py – on_session_destroyed(): Remove invalid
    del ds / del plot_widget calls that referenced variables not in
    scope, which would raise UnboundLocalError on every session teardown.

  • TSP/utility.py – load_data() ERDDAP branch: Replace three
    separate xr.open_dataset() calls (each opening a new connection)
    with a single context-managed open.

  • ncapp/app/main.py: Fix from http.client import HTTPException (wrong
    module; HTTPException is from FastAPI). Rename duplicate index()
    route handlers to tsp() and trj() to avoid silent handler shadowing.
    Remove duplicate FastAPI() instantiation.

New features

  • TSP/main.py – axis control: Add Invert Y-axis and Swap axes
    checkboxes to the toolbar. Invert Y-axis combines with the existing
    CF-attribute / dimension-name auto-detection (OR logic) so users can
    force inversion on any dataset. Swap axes applies invert_axes=True
    to the HoloViews plot for the classic oceanographic profile orientation
    (depth on y, variable on x). Both flags are propagated through all
    variable/dimension/frequency change callbacks so state is preserved
    across interactions.

  • TSP/utility.py: Extract and export get_plottable_vars(),
    get_axis_candidates(), is_plottable(), safe_check_var(), and
    AXIS_BLACKLIST so the variable- and axis-selection logic is reusable
    and testable in isolation.

  • TSP/utility.py: Add build_metadata_widget() returning a plain
    Div instead of a nested pn.Row/Column, reducing layout nesting
    and fixing the widget-width issue that caused metadata to overflow.

  • trj/main.py: Add metadata panel (dataset global attributes) to the
    trajectory viewer, wired through build_metadata_widget() from the
    shared utility module. Add on_load() callback to hide the metadata
    panel after the page finishes loading (workaround for Panel layout
    width initialisation). Switch layout from pn.Column to pn.GridBox
    inside pn.Row to keep the map and plot side-by-side with the
    metadata panel.

Cleanup / code quality

  • Replaced all print() debug statements in TSP/main.py with
    logger.info() / logger.warning() calls.
  • Removed dead safe_check() local closure (superseded by
    safe_check_var() in utility.py).
  • Removed a hardcoded example-data dict that was embedded in the
    export_selection() payload, causing exports to always target a fixed
    URL instead of the user's loaded dataset.
  • Removed a dead BootstrapTemplate block that was never reachable.
  • Replaced bare except: clauses with except Exception: throughout
    utility.py.
  • docker/entrypoint.sh: Update panel serve command to only mount
    /TSP and /TRJ (removing stale /trj, /OGC_client, /seaice/*,
    /seaicemod/* mounts that no longer exist); rename /TRJ title from
    "trj" to "Trajectory".
  • ncapp/app/utility.py: New file — Pydantic v2 models (FeatureType,
    FeatureTypeEnum, URLStr, OpendapURL) and guess_feature_type_from_data()
    helper extracted into a dedicated module for the FastAPI proxy service.

epifanio and others added 10 commits April 17, 2026 09:54
Critical fixes:
- ncapp/main.py: import HTTPException from fastapi (not http.client), rename
  duplicate route handlers index→tsp/trj, remove double FastAPI() instantiation
- TSP/utility.py: remove duplicate dict_to_html/dict_to_html_ul/get_download_link
  definitions and unreachable code after return
- TSP/utility.py: fix resource leaks in validate_opendap() and ERDAPP workaround
  (use context managers, open URL once instead of 4x)
- ncapp/utility.py: fix Pydantic validator using yield instead of return; fix
  double ds.close() on error path; use context manager for dataset
- TSP/js_util.py + TSP/main.py: replace eval()-based Javascript class with safe
  Redirector using window.location.assign() — eliminates JS injection vector

Code quality:
- TSP/main.py: extract FILL_VALUE = 9.96921e36 constant, replace 4 magic literals
- TSP/main.py: fix inverted resampling logic in export_selection()
- TSP/main.py: remove hardcoded test data dict from export_selection()
- TSP/utility.py: replace bare except: with except Exception:
- TSP/utility.py: remove double gc import
- TSP/utility.py: remove broken on_session_destroyed cleanup (tried to del locals)
- entrypoint.sh: remove --autoreload from production panel serve command
- All files: remove dead commented-out code (old layout variants, debug prints,
  unreachable blocks), replace print() calls with logger.info()

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The replace_all that substituted 9.96921e36 → FILL_VALUE also rewrote
the constant definition itself, producing `FILL_VALUE = FILL_VALUE`.
Restore the literal value in the definition.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The old approach filtered variables by checking whether their coordinates
intersected with ds.dims. This failed for DSG (Discrete Sampling Geometry)
datasets where the obs dimension is not a named coordinate:
- Profile datasets: temperature/salinity/DIC had dim 'obs' which was not in
  ds.coords, so they were excluded; profile-level metadata (time, lat, lon)
  passed instead.
- TimeSeriesProfile indexed-ragged: dims (profile/obs) had no overlap with
  coords (time/PP), so every variable was excluded.

New helpers in TSP/utility.py:

  get_plottable_vars(ds)
    - Iterates ds.data_vars (not ds, which includes coords)
    - Keeps numeric dtypes only (float/int, excludes string/object)
    - Skips coordinate-like names (lat, lon, time, depth, z, pressure, ...)
    - Skips QC/flag variables by suffix (_qc, _flag, woce, _err, ...)
    - Applies safe_check_var() for live access validation
    - featureType-agnostic: one code path for all CF feature types

  get_axis_candidates(ds, var_name)
    - Returns named dimension indexes first (proper dim coords)
    - Then scans all 1-D variables sharing the variable's dimensions for
      datetime or numeric dtype — picks up depth(obs), pressure(obs),
      time(obs) in profile/DSG data where obs has no registered coord
    - Falls back to raw dimension name so the selector is never empty

  safe_check_var(ds, var) extracted from main.py safe_check() closure

  _is_time_like(name) promoted to module level for reuse in on_var_select

main.py:
- Replace 4 identical featureType if-blocks with get_plottable_vars(ds)
- Remove coord-count guard (len(ds[i].coords) != 0) — redundant and wrong
  for vars whose coords aren't registered
- Replace ds[result].indexes loop in on_var_select and initial setup with
  get_axis_candidates() + time-first sort
- Remove local safe_check() closure

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ates

get_axis_candidates() was scanning both ds.coords and ds.data_vars for
any 1-D numeric variable sharing the plotted variable's dimension.  This
caused data variables (temperature, salinity, DIC, TT, FF, …) to appear
in the Dimension dropdown alongside genuine axis variables.

New logic keeps three strictly-ordered passes, each scoped to coordinate-
type names only:

  1. Named dimension indexes (proper xarray dim coords) — unchanged.
  2. Auxiliary coordinates in ds.coords sharing the dimension — covers
     latitude(time), longitude(time) and similar attached coords.
  3. Data variables in ds.data_vars whose lowercased name appears in
     _COORD_LIKE_NAMES (lat, lon, z, depth, pressure, time, …).  This
     handles DSG files where z/pressure land in data_vars because the
     file omits a 'coordinates' attribute.  Generic data variables
     (temperature, salinity, etc.) are NOT in _COORD_LIKE_NAMES and are
     therefore excluded.
  4. Raw dimension name fallback so the selector is never empty.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… selector

Two bugs triggered by the Profile 1 dataset:

1. latitude selected as dimension → DataError crash
   get_axis_candidates() was picking up latitude (a coord-like data_var
   with dim 'profile') as an axis candidate before AXIS_BLACKLIST was
   applied. Added AXIS_BLACKLIST (public frozenset, easy to extend) and
   apply it in all three passes of get_axis_candidates().
   Default blacklist: lat/lon, altitude, station, profile, trajectory,
   row_size. Time, depth, pressure are intentionally NOT blacklisted.

2. bottomDepth / profileidAll appearing in variable selector
   These are per-station scalar metadata (one value per profile instance,
   not per observation). Two additions to the exclusion rules in
   is_plottable():
   - _COORD_LIKE_NAMES gains bottomdepth / bottom_depth
   - _QC_SUFFIXES renamed to _EXCLUDE_SUFFIXES and extended with
     identifier suffixes: idall, _id, id — catches profileidAll and
     similar integer-index columns

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Dimension switching was broken by a wrong x/y axis assignment in plot():
the non-timeseries branch set x=var (e.g. x='soil_temperature') instead
of x=dimension.  HVplot does not recognise a DataArray's own name as an
axis specifier and falls back silently to its previous layout, so the
plot never changed when the user selected a different dimension.

Fixes in plot() non-timeseries branch:
- Always set x=dimension (the selected axis variable)
- Remove any stale 'y' key so hvplot uses DataArray values as y by default
- For DSG-style data where dimension is a separate 1-D variable not
  registered as a coord of var (e.g. depth(obs) alongside temperature(obs)):
  build a minimal sub-dataset and use the Dataset-level hvplot call so
  both arrays are accessible to hvplot
- Replace the three duplicated try/except TypeError/ValueError fallback
  blocks with a single except Exception that logs a warning and falls
  back to the default axis
- Guard ds[dimension].attrs access (was a KeyError when dimension was a
  raw dim name like 'obs' with no corresponding variable)
- Replace remaining print() with logger.warning/logger.info

Also committed in this batch:
- AXIS_BLACKLIST frozenset in utility.py: latitude, longitude, altitude,
  station, profile, trajectory, row_size excluded from the Dimension
  selector even when they are valid coordinate variables
- _COORD_LIKE_NAMES: add bottomdepth/bottom_depth (per-station scalar,
  not a per-observation measurement)
- _QC_SUFFIXES renamed _EXCLUDE_SUFFIXES, extended with idall/_id/id
  to exclude integer-identifier columns (profileidAll, etc.)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
## Security

- js_util.py: Replace `Javascript` class (which used `eval()` to execute
  arbitrary JS) with `Redirector`, a safe ReactiveHTML component that uses
  `window.location.assign()` directly. No code is evaluated; the URL is
  reset to '' after the redirect fires to allow reuse.

## Bug fixes

- TSP/main.py – plottable-variable selection: Replace the fragile
  heuristic (coord-in-dims intersection) with `get_plottable_vars()` from
  utility.py. The old approach missed variables in DSG (Discrete Sampling
  Geometry) datasets where the observation dimension has no registered
  coordinate, and incorrectly included coordinate/metadata variables
  (latitude, longitude, bottomDepth, profileIdAll, QC flags, etc.).

- TSP/main.py – dimension selector: Replace `list(ds[result].indexes)`
  with `get_axis_candidates()`, which scans named indexes, auxiliary
  1-D coordinates, and coordinate-like data variables (depth, pressure,
  etc.) while excluding spatial metadata (lat, lon, altitude, station,
  …) via a configurable `AXIS_BLACKLIST`. Previously the selector was
  populated with all data variables, making it unusable on real datasets.

- TSP/main.py – dimension switching: Fix the non-timeseries branch of
  `plot()` where `x` was incorrectly set to `var` instead of `dimension`,
  causing hvplot to ignore the dimension selection entirely. Now always
  sets `axis_arguments['x'] = dimension` and removes any stale `'y'` key.
  For DSG datasets where the dimension variable is not registered as a
  coordinate on the target variable, falls back to a Dataset-level
  `hvplot.line(x=dimension, y=var)` call using a minimal sub-dataset.

- TSP/main.py – y-axis inversion: Guard `ds[dimension].attrs` lookup with
  `if dimension in ds:` to avoid KeyError when `dimension` is a raw
  dimension name (e.g. `obs`) with no associated variable.

- TSP/main.py – `on_var_select()`: Fix crash where `dimension_group.options`
  was populated with `list(ds[result].indexes)` (result is a list, not a
  string key) instead of axis candidates for the selected variable. Rebuild
  axis candidates and reset `dimension_group.value` to the first option
  whenever the variable changes.

- TSP/main.py – export resampling logic: Fix inverted boolean in
  `export_selection()` that caused the resampler flag to always be False
  even when a non-raw frequency was selected.

- TSP/main.py – show/hide buttons: Fix `show_hide_widget()` so that
  opening the metadata panel closes the downloader and vice-versa, by
  accepting explicit `hide=` and `reveal=` arguments via `functools.partial`.

- TSP/utility.py – `validate_opendap()`: Use a context manager
  (`with xr.open_dataset(...) as ...`) to avoid leaking file handles.

- TSP/utility.py – `on_session_destroyed()`: Remove invalid
  `del ds` / `del plot_widget` calls that referenced variables not in
  scope, which would raise `UnboundLocalError` on every session teardown.

- TSP/utility.py – `load_data()` ERDDAP branch: Replace three
  separate `xr.open_dataset()` calls (each opening a new connection)
  with a single context-managed open.

- ncapp/app/main.py: Fix `from http.client import HTTPException` (wrong
  module; `HTTPException` is from FastAPI). Rename duplicate `index()`
  route handlers to `tsp()` and `trj()` to avoid silent handler shadowing.
  Remove duplicate `FastAPI()` instantiation.

## New features

- TSP/main.py – axis control: Add `Invert Y-axis` and `Swap axes`
  checkboxes to the toolbar. `Invert Y-axis` combines with the existing
  CF-attribute / dimension-name auto-detection (OR logic) so users can
  force inversion on any dataset. `Swap axes` applies `invert_axes=True`
  to the HoloViews plot for the classic oceanographic profile orientation
  (depth on y, variable on x). Both flags are propagated through all
  variable/dimension/frequency change callbacks so state is preserved
  across interactions.

- TSP/utility.py: Extract and export `get_plottable_vars()`,
  `get_axis_candidates()`, `is_plottable()`, `safe_check_var()`, and
  `AXIS_BLACKLIST` so the variable- and axis-selection logic is reusable
  and testable in isolation.

- TSP/utility.py: Add `build_metadata_widget()` returning a plain
  `Div` instead of a nested `pn.Row/Column`, reducing layout nesting
  and fixing the widget-width issue that caused metadata to overflow.

- trj/main.py: Add metadata panel (dataset global attributes) to the
  trajectory viewer, wired through `build_metadata_widget()` from the
  shared utility module. Add `on_load()` callback to hide the metadata
  panel after the page finishes loading (workaround for Panel layout
  width initialisation). Switch layout from `pn.Column` to `pn.GridBox`
  inside `pn.Row` to keep the map and plot side-by-side with the
  metadata panel.

## Cleanup / code quality

- Replaced all `print()` debug statements in TSP/main.py with
  `logger.info()` / `logger.warning()` calls.
- Removed dead `safe_check()` local closure (superseded by
  `safe_check_var()` in utility.py).
- Removed a hardcoded example-data dict that was embedded in the
  `export_selection()` payload, causing exports to always target a fixed
  URL instead of the user's loaded dataset.
- Removed a dead BootstrapTemplate block that was never reachable.
- Replaced bare `except:` clauses with `except Exception:` throughout
  utility.py.
- docker/entrypoint.sh: Update `panel serve` command to only mount
  `/TSP` and `/TRJ` (removing stale `/trj`, `/OGC_client`, `/seaice/*`,
  `/seaicemod/*` mounts that no longer exist); rename `/TRJ` title from
  `"trj"` to `"Trajectory"`.
- ncapp/app/utility.py: New file — Pydantic v2 models (`FeatureType`,
  `FeatureTypeEnum`, `URLStr`, `OpendapURL`) and `guess_feature_type_from_data()`
  helper extracted into a dedicated module for the FastAPI proxy service.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant