Commit 832c95f

docs: hidden attributes are platform-only; users should not declare them
Updated to reflect the design decision in datajoint/datajoint-python#1441: the parser keeps rejecting leading-underscore attribute names and now returns a clear DataJointError instead of a cryptic ParseException.

Reframe §3.4 around the platform-managed-only intent:

- Lead paragraph states up-front that user-defined hidden attributes are not supported, and shows the new error message users will see.
- Drop the "User-defined hidden attributes" subsection and the _params_hash hidden example.
- Keep the platform-attributes table and the behavior matrix; both are still useful for users encountering platform-managed hidden columns (_job_start_time, etc.) in fetch results, joins, and describe output.
- Add an explanation paragraph ("Why users can't declare them") covering the no-write-path / no-round-trip / silent-filter rationale.
- Replace the user-defined example with a regular-attribute example (params_hash backing a unique index), demonstrating the recommended pattern: declare as a regular attribute, use proj() at the call site for visibility control.
1 parent cacff63 commit 832c95f

1 file changed

Lines changed: 37 additions & 25 deletions

src/reference/specs/table-declaration.md

@@ -158,9 +158,16 @@ attribute_name [= default_value] : type [# comment]
 
 ### 3.4 Hidden Attributes
 
-Attributes with names starting with an underscore (`_`) are **hidden**. The hidden-attribute mechanism was designed for platform operations — bookkeeping columns DataJoint itself adds to support the data pipeline — and is filtered out of normal user-facing query results. Some hidden-attribute functionality is exposed to users as well, but the feature is not intended as a general column-hiding tool.
+Attributes with names starting with an underscore (`_`) are **hidden**. The hidden-attribute mechanism is reserved for **platform-managed** columns — bookkeeping that DataJoint itself adds to support the data pipeline — and is intentionally not exposed for user-defined attributes. Attempting to declare an attribute name with a leading underscore raises:
 
-**Platform-managed hidden attributes** are added automatically when DataJoint declares certain table types. Users do not write these in the definition:
+```text
+DataJointError: Attribute name in line "_hidden: bool" starts with an underscore.
+Names with leading underscore are reserved for platform-managed columns
+(e.g. _job_start_time, _singleton). Use a regular attribute name; if you
+need to control visibility at the call site, use proj().
+```
+
+**Platform-managed hidden attributes** are added automatically when DataJoint declares certain table types. Users do not write these in the definition; the framework injects them programmatically after parsing.
 
 | Hidden attribute | Added to | Purpose |
 |------------------|----------|---------|
@@ -169,22 +176,9 @@ Attributes with names starting with an underscore (`_`) are **hidden**. The hidd
 | `_job_version` | `Computed`, `Imported` | Library version that produced the row |
 | `_singleton` | Singleton tables | Implementation detail of the singleton pattern |
 
-**User-defined hidden attributes.** A definition may also declare hidden attributes directly. The most common use case is storing a derived value (for example, a hash of a JSON column) that backs a unique index but should not appear in query results:
-
-```python
-@schema
-class TaskParams(dj.Manual):
-    definition = """
-    task_id : int32
-    ---
-    tool : varchar(32)
-    params : json
-    _params_hash : varchar(32)
-    unique index (tool, _params_hash)
-    """
-```
+These columns are populated by DataJoint internals via raw SQL during the `populate()` lifecycle, not via `insert`/`update1`. They are filtered out of every public API surface so they don't clutter joins, fetches, or displays.
 
-**Behavior.** Hidden attributes are filtered out of nearly every user-facing surface. The filter is implemented in `Heading.attributes`, which all visible code paths consume; raw SQL strings bypass it.
+**Behavior.** The filter is implemented in `Heading.attributes`, which all visible code paths consume; raw SQL strings bypass it.
 
 | Context | Hidden attributes |
 |---------|-------------------|
@@ -197,16 +191,14 @@ class TaskParams(dj.Manual):
 | Natural-join namesake matching | Excluded |
 | Dict restriction `Table & {"_name": value}` | Silently ignored |
 | String restriction `Table & "_name = ..."` | Included (passes to SQL) |
-| `insert()`, `insert1()`, `update1()` | Rejected — see write caveat below |
+| `insert()`, `insert1()`, `update1()` | Rejected (`Field not in table heading`) |
 | `insert(..., ignore_extra_fields=True)` | Silently dropped (key not written) |
-| `describe()` / reverse-engineered definition | **Excluded** — see round-trip caveat below |
+| `describe()` / reverse-engineered definition | Excluded |
 | `unique index (..., _name)` | Allowed |
 
-**Write caveat.** Neither `insert`/`insert1` nor `update1` accepts hidden attributes through the public API. `update1` raises `DataJointError: Attribute '_name' not found.` `insert` raises `Field '_name' not in table heading` — unless `ignore_extra_fields=True` is passed, in which case the hidden key is *silently dropped* and never written. There is currently no public-API path to populate a user-defined hidden column. Platform-managed hidden columns (the `_job_*` group) are populated by DataJoint internals via raw SQL during the `populate()` lifecycle (see `autopopulate.py`), not via the user-facing `insert`/`update1` methods. If you declare a user-defined hidden column today and need to populate it, you must do so via `connection.query()` with a raw `INSERT` or `UPDATE`, or compute it from a non-hidden column inside an `auto_populate` step.
-
-**Round-trip caveat.** `describe()` walks `heading.attributes`, so it omits hidden attributes from the regenerated definition. For platform-managed hidden columns this is harmless: re-declaring a `Computed` or `Imported` table re-injects `_job_*` automatically. For *user-defined* hidden columns (such as `_params_hash` above), the regenerated definition is incomplete — re-applying it would create a table without the hidden column. Treat `describe()` output as a starting point for review, not as a faithful round-trip when user-defined hidden columns are present.
+**Why users can't declare them.** Allowing user-defined hidden attributes would expose a feature with no public-API write path (`insert`/`update1` reject the keys; `ignore_extra_fields=True` drops them silently), no `describe()` round-trip (the regenerated definition would be missing the column), and silent filtering on dict restrictions. The cases users typically reach for hidden attributes — most commonly an index-backing derived column — are better served by a regular attribute.
 
-**Accessing hidden attributes:**
+**Inspecting platform-managed hidden columns:**
 
 ```python
 # Default fetch — hidden columns excluded
@@ -225,9 +217,29 @@ MyTable & "_job_start_time > '2024-01-01'"
 MyTable & {'_job_start_time': some_date} # ⚠ ignored
 ```
 
-**When to declare a hidden attribute.** The bar is high. Reach for the `_` prefix only when the column is purely a platform/implementation concern that application code never reads, writes, or references — for example, `_job_start_time` (populated by `populate()` lifecycle internals), `_singleton` (an implementation detail of the singleton pattern), or a field whose values would actively interfere with natural-join semantics if visible.
+**Use a regular attribute instead.** When you want a column that's part of the schema-level contract (backing an index, storing a derived value, etc.) but isn't featured in default displays, declare it as a regular attribute and use `proj()` at the call site if you want to omit it from a particular query result. For example, a hash column backing a unique index:
 
-If your application code computes the column, inserts it, queries on it, or wants to see it in `describe()` output, **declare it as a regular attribute** even when you don't want it featured prominently. Backing a unique index, on its own, is not a sufficient reason to hide a column — for example, a `params_hash` column that backs `unique index (tool, params_hash)` should be a regular attribute because the application code is the one computing and inserting the hash. Hiding it forfeits `insert1`, dict restrictions, and `describe()` round-trip without buying anything you couldn't get from `proj()` at the call site for visibility control.
+```python
+@schema
+class TaskParams(dj.Manual):
+    definition = """
+    task_id : int32
+    ---
+    tool : varchar(32)
+    params : json
+    params_hash : varchar(32)
+    unique index (tool, params_hash)
+    """
+
+# Inserts work directly:
+TaskParams.insert1({'task_id': 1, 'tool': 't', 'params': {...}, 'params_hash': h})
+
+# Dict restrictions work:
+TaskParams & {'params_hash': h}
+
+# Hide from a specific result set with proj() if needed:
+TaskParams.proj('tool', 'params').fetch()
+```
 
 ### 3.5 Examples
 
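The added `TaskParams` example inserts a precomputed value `h` for `params_hash` without showing where it comes from. The following is a minimal sketch of one way application code might derive it, not something the commit or DataJoint prescribes. The helper name `params_hash` is hypothetical, and MD5 is assumed only because its hex digest is exactly 32 characters and so fits the `varchar(32)` column; it serves as a fingerprint here, not as a security measure. Serializing with sorted keys is assumed so that two dicts with the same content hash identically regardless of key order.

```python
import hashlib
import json


def params_hash(params: dict) -> str:
    """Return a 32-character hex fingerprint of a JSON-serializable dict.

    Hypothetical helper: the source does not specify how the hash is computed.
    """
    # Canonical form: sorted keys and fixed separators, so key order and
    # whitespace do not affect the digest.
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


# Build a row for a table shaped like TaskParams (values are illustrative).
params = {"threshold": 0.5, "mode": "fast"}
row = {
    "task_id": 1,
    "tool": "t",
    "params": params,
    "params_hash": params_hash(params),
}
```

Because `params_hash` is a regular attribute, a row like this could be passed to `TaskParams.insert1(row)` and the same digest reused in a dict restriction, which is the pattern the new §3.4 text recommends.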