feat: add Jira Service Management (JSM) connector#9816
feat: add Jira Service Management (JSM) connector#9816zax0rz wants to merge 4 commits intoonyx-dot-app:mainfrom
Conversation
There was a problem hiding this comment.
2 issues found across 6 files
Confidence score: 4/5
- This PR appears safe to merge with minimal risk: the noted issues are low-to-moderate severity, and the highest-severity item has very low confidence, which lowers certainty of real-world impact.
- The main functional concern is in
backend/onyx/connectors/jira_service_management/connector.py, where incremental JQL may ignore theendbound (updated >= startonly), which could cause wider-than-intended reprocessing windows. - Also in
backend/onyx/connectors/jira_service_management/connector.py, swallowing exceptions during JSM comment extraction without logging can hide extraction failures and silently drop comment content. - Pay close attention to
backend/onyx/connectors/jira_service_management/connector.py- incremental window bounds and comment-extraction error visibility need verification.
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/onyx/connectors/jira_service_management/connector.py">
<violation number="1" location="backend/onyx/connectors/jira_service_management/connector.py:59">
P2: Exceptions during JSM comment extraction are swallowed without logging, which can silently drop comment content and make extraction failures invisible to operators.</violation>
<violation number="2" location="backend/onyx/connectors/jira_service_management/connector.py:154">
P2: Incremental JQL ignores the `end` parameter. `load_from_checkpoint` accepts `end` but the JQL only filters on `updated >= start` with no upper bound, so incremental windows aren’t honored and can reprocess updates beyond the intended range.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| comment_strs = [] | ||
| try: | ||
| comment_strs = get_comment_strs(issue) | ||
| except Exception: |
There was a problem hiding this comment.
P2: Exceptions during JSM comment extraction are swallowed without logging, which can silently drop comment content and make extraction failures invisible to operators.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/onyx/connectors/jira_service_management/connector.py, line 59:
<comment>Exceptions during JSM comment extraction are swallowed without logging, which can silently drop comment content and make extraction failures invisible to operators.</comment>
<file context>
@@ -0,0 +1,212 @@
+ comment_strs = []
+ try:
+ comment_strs = get_comment_strs(issue)
+ except Exception:
+ pass
+
</file context>
| end: SecondsSinceUnixEpoch, | ||
| checkpoint: JiraServiceManagementCheckpoint, | ||
| ) -> CheckpointOutput[JiraServiceManagementCheckpoint]: | ||
| jql = self._build_jql(start) |
There was a problem hiding this comment.
P2: Incremental JQL ignores the end parameter. load_from_checkpoint accepts end but the JQL only filters on updated >= start with no upper bound, so incremental windows aren’t honored and can reprocess updates beyond the intended range.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At backend/onyx/connectors/jira_service_management/connector.py, line 154:
<comment>Incremental JQL ignores the `end` parameter. `load_from_checkpoint` accepts `end` but the JQL only filters on `updated >= start` with no upper bound, so incremental windows aren’t honored and can reprocess updates beyond the intended range.</comment>
<file context>
@@ -0,0 +1,212 @@
+ end: SecondsSinceUnixEpoch,
+ checkpoint: JiraServiceManagementCheckpoint,
+ ) -> CheckpointOutput[JiraServiceManagementCheckpoint]:
+ jql = self._build_jql(start)
+ current_start = checkpoint.start_at
+
</file context>
Greptile SummaryThis PR adds a new Jira Service Management (JSM) connector that indexes JSM tickets (Service Request, Incident, Problem, Change, Service Task) from a specified project using the Jira REST API v2. It correctly registers the new The implementation is clean and well-structured — it reuses the existing Jira utilities ( Remaining issues:
Confidence Score: 4/5
Important Files Changed
Sequence DiagramsequenceDiagram
participant Runner as Connector Runner
participant JSM as JiraServiceManagementConnector
participant API as Jira REST API
Runner->>JSM: load_from_checkpoint(start, end, checkpoint)
JSM->>JSM: _build_jql(start)
note over JSM: JQL: project=X AND issuetype in (...) AND updated >= start
loop Paginate via checkpoint.start_at
JSM->>API: search_issues(jql, startAt, maxResults=batch_size)
alt API error
API-->>JSM: Exception
JSM->>Runner: yield ConnectorFailure(EntityFailure)
JSM->>Runner: return Checkpoint(has_more=False)
else Empty page
API-->>JSM: []
JSM->>Runner: return Checkpoint(has_more=False)
else Issues returned
API-->>JSM: [Issue, ...]
loop Per issue
JSM->>JSM: _process_jsm_issue(issue)
alt Process success
JSM->>Runner: yield Document
else Process error
JSM->>Runner: yield ConnectorFailure(DocumentFailure)
end
end
alt fetched < batch_size (last page)
JSM->>Runner: return Checkpoint(has_more=False)
else Full page (more pages)
JSM->>Runner: yield Checkpoint(has_more=True, start_at=next)
end
end
end
Prompt To Fix All With AIThis is a comment left during a code review.
Path: backend/tests/daily/connectors/jira_service_management/test_jsm_basic.py
Line: 34
Comment:
**Stale hardcoded org-specific URL in assertion**
`_make_connector()` now correctly uses `os.environ.get("JIRA_BASE_URL", "https://example.atlassian.net")` as the default, but this assertion still hardcodes `"https://danswerai.atlassian.net"`. When `JIRA_BASE_URL` is not set in CI, `jsm_connector.jira_base` will be `"https://example.atlassian.net"` and this assertion will always fail.
```suggestion
assert jsm_connector.jira_base == os.environ.get("JIRA_BASE_URL", "https://example.atlassian.net").rstrip("/")
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: backend/onyx/connectors/jira_service_management/connector.py
Line: 56-60
Comment:
**Silently swallowed comment fetch exception**
The `except Exception: pass` block discards errors from `get_comment_strs` without any log line. Per the project's best practices ("fail loudly instead of silently swallowing errors"), at minimum a warning should be logged so that unexpected failures are visible during debugging and monitoring.
```suggestion
try:
comment_strs = get_comment_strs(issue)
except Exception as e:
logger.warning(f"Failed to fetch comments for issue {issue.key}: {e}")
```
**Context Used:** contributing_guides/best_practices.md ([source](https://app.greptile.com/review/custom-context?memory=a870af60-c5ac-4007-aa2c-f5d2a62f9725))
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: backend/onyx/connectors/jira_service_management/connector.py
Line: 148-154
Comment:
**`end` parameter is never used**
`end: SecondsSinceUnixEpoch` is accepted but never applied to the JQL. As a result, `_build_jql` only adds a lower bound (`updated >= start`) with no upper bound (`updated <= end`). During an incremental sync window, issues updated *after* `end` will be picked up and indexed, potentially causing duplicate processing with the next sync run.
Consider adding the upper bound:
```python
if start is not None and start > 0:
end_str = datetime.fromtimestamp(end, tz=timezone.utc).strftime("%Y-%m-%d %H:%M")
return f'{base_jql} AND updated >= "{start_str}" AND updated <= "{end_str}"'
```
If intentionally omitted (e.g., to match the existing Jira connector's behaviour), a brief comment noting the reason would help future readers.
How can I resolve this? If you propose a fix, please make it concise.Reviews (2): Last reviewed commit: "fix: use self.batch_size instead of hard..." | Re-trigger Greptile |
| except Exception as e: | ||
| yield ConnectorFailure( | ||
| failed_document=None, | ||
| failure_message=f"Failed to fetch JSM issues: {e}", | ||
| exception=e, | ||
| ) | ||
| return JiraServiceManagementCheckpoint( | ||
| start_at=current_start, has_more=False |
There was a problem hiding this comment.
ConnectorFailure model validation crash on page-level errors
ConnectorFailure has a Pydantic mode="before" validator that requires exactly one of failed_document or failed_entity to be non-None:
if (failed_document is None and failed_entity is None) or (
failed_document is not None and failed_entity is not None
):
raise ValueError("Exactly one of 'failed_document' or 'failed_entity' must be specified.")Passing failed_document=None (while failed_entity also defaults to None) will always raise a ValueError from inside the Pydantic constructor. Because this ValueError is raised before the yield, the generator terminates with an unexpected exception instead of gracefully reporting the failure.
Use failed_entity with an EntityFailure to represent a non-document-specific failure:
| except Exception as e: | |
| yield ConnectorFailure( | |
| failed_document=None, | |
| failure_message=f"Failed to fetch JSM issues: {e}", | |
| exception=e, | |
| ) | |
| return JiraServiceManagementCheckpoint( | |
| start_at=current_start, has_more=False | |
| yield ConnectorFailure( | |
| failed_entity=EntityFailure( | |
| entity_id=f"jql:{jql}:page:{current_start}", | |
| ), | |
| failure_message=f"Failed to fetch JSM issues: {e}", | |
| exception=e, | |
| ) |
You'll also need to import EntityFailure from onyx.connectors.models.
Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/onyx/connectors/jira_service_management/connector.py
Line: 165-172
Comment:
**`ConnectorFailure` model validation crash on page-level errors**
`ConnectorFailure` has a Pydantic `mode="before"` validator that requires **exactly one** of `failed_document` or `failed_entity` to be non-`None`:
```python
if (failed_document is None and failed_entity is None) or (
failed_document is not None and failed_entity is not None
):
raise ValueError("Exactly one of 'failed_document' or 'failed_entity' must be specified.")
```
Passing `failed_document=None` (while `failed_entity` also defaults to `None`) will always raise a `ValueError` from inside the Pydantic constructor. Because this `ValueError` is raised *before* the `yield`, the generator terminates with an unexpected exception instead of gracefully reporting the failure.
Use `failed_entity` with an `EntityFailure` to represent a non-document-specific failure:
```suggestion
yield ConnectorFailure(
failed_entity=EntityFailure(
entity_id=f"jql:{jql}:page:{current_start}",
),
failure_message=f"Failed to fetch JSM issues: {e}",
exception=e,
)
```
You'll also need to import `EntityFailure` from `onyx.connectors.models`.
How can I resolve this? If you propose a fix, please make it concise.| results = list(jsm_connector.load_from_checkpoint(start=0, end=100, checkpoint=checkpoint)) | ||
|
|
||
| # Should return final checkpoint with has_more=False | ||
| assert len(results) == 1 | ||
| final_checkpoint = results[0] | ||
| assert isinstance(final_checkpoint, type(checkpoint)) | ||
| assert final_checkpoint.has_more is False |
There was a problem hiding this comment.
Test incorrectly expects generator
return value in list() output
load_from_checkpoint is a generator (CheckpointOutput). Its terminal state is communicated via a return <checkpoint> statement, which becomes the generator's StopIteration.value — it is not part of the sequence of yielded items.
When mock_jira_client.search_issues returns [], the generator hits if not issues: return JiraServiceManagementCheckpoint(...) without yielding anything. list(...) collects only yielded values, so results will be [] and assert len(results) == 1 will fail.
To correctly capture the final checkpoint, drain the generator manually:
gen = jsm_connector.load_from_checkpoint(start=0, end=100, checkpoint=checkpoint)
yielded = []
try:
while True:
yielded.append(next(gen))
except StopIteration as exc:
final_checkpoint = exc.value
assert len(yielded) == 0
assert isinstance(final_checkpoint, JiraServiceManagementCheckpoint)
assert final_checkpoint.has_more is FalsePrompt To Fix With AI
This is a comment left during a code review.
Path: backend/tests/daily/connectors/jira_service_management/test_jsm_basic.py
Line: 61-67
Comment:
**Test incorrectly expects generator `return` value in `list()` output**
`load_from_checkpoint` is a generator (`CheckpointOutput`). Its terminal state is communicated via a `return <checkpoint>` statement, which becomes the generator's `StopIteration.value` — it is **not** part of the sequence of yielded items.
When `mock_jira_client.search_issues` returns `[]`, the generator hits `if not issues: return JiraServiceManagementCheckpoint(...)` without yielding anything. `list(...)` collects only yielded values, so `results` will be `[]` and `assert len(results) == 1` will fail.
To correctly capture the final checkpoint, drain the generator manually:
```python
gen = jsm_connector.load_from_checkpoint(start=0, end=100, checkpoint=checkpoint)
yielded = []
try:
while True:
yielded.append(next(gen))
except StopIteration as exc:
final_checkpoint = exc.value
assert len(yielded) == 0
assert isinstance(final_checkpoint, JiraServiceManagementCheckpoint)
assert final_checkpoint.has_more is False
```
How can I resolve this? If you propose a fix, please make it concise.|
|
||
| def _make_connector() -> JiraServiceManagementConnector: | ||
| connector = JiraServiceManagementConnector( | ||
| jira_base_url="https://danswerai.atlassian.net", |
There was a problem hiding this comment.
Hardcoded organization-specific URL
https://danswerai.atlassian.net is a real company's Atlassian URL. Daily/integration tests should use a configurable URL read from an environment variable (e.g. JIRA_BASE_URL) with a clearly fake fallback, consistent with how JIRA_USER_EMAIL and JIRA_API_TOKEN are handled below.
| jira_base_url="https://danswerai.atlassian.net", | |
| jira_base_url=os.environ.get("JIRA_BASE_URL", "https://example.atlassian.net"), |
Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/tests/daily/connectors/jira_service_management/test_jsm_basic.py
Line: 14
Comment:
**Hardcoded organization-specific URL**
`https://danswerai.atlassian.net` is a real company's Atlassian URL. Daily/integration tests should use a configurable URL read from an environment variable (e.g. `JIRA_BASE_URL`) with a clearly fake fallback, consistent with how `JIRA_USER_EMAIL` and `JIRA_API_TOKEN` are handled below.
```suggestion
jira_base_url=os.environ.get("JIRA_BASE_URL", "https://example.atlassian.net"),
```
How can I resolve this? If you propose a fix, please make it concise.| batch_size: int = INDEX_BATCH_SIZE, | ||
| ) -> None: | ||
| self.jira_base = jira_base_url.rstrip("/") | ||
| self.project_key = project_key | ||
| self.batch_size = batch_size |
There was a problem hiding this comment.
batch_size stored but never used
self.batch_size is assigned in __init__ but the code always passes the module-level constant _JSM_PAGE_SIZE (hard-coded 50) to search_issues. This makes the constructor parameter misleading — callers who set batch_size will see it silently ignored.
Either use self.batch_size as the maxResults value in the search_issues call, or remove the batch_size parameter entirely to match the actual behaviour.
Prompt To Fix With AI
This is a comment left during a code review.
Path: backend/onyx/connectors/jira_service_management/connector.py
Line: 116-120
Comment:
**`batch_size` stored but never used**
`self.batch_size` is assigned in `__init__` but the code always passes the module-level constant `_JSM_PAGE_SIZE` (hard-coded `50`) to `search_issues`. This makes the constructor parameter misleading — callers who set `batch_size` will see it silently ignored.
Either use `self.batch_size` as the `maxResults` value in the `search_issues` call, or remove the `batch_size` parameter entirely to match the actual behaviour.
How can I resolve this? If you propose a fix, please make it concise.Adds a new connector for Jira Service Management (JSM) that indexes tickets (Service Requests, Incidents, Problems, Changes, Service Tasks) from a specified JSM project using the Jira REST API v2. Closes onyx-dot-app#2281 Changes: - New connector: backend/onyx/connectors/jira_service_management/connector.py - CheckpointedConnector with proper JiraServiceManagementCheckpoint model - JQL-based incremental sync with updated >= filter for polling - Reuses build_jira_client and utility functions from existing Jira connector - Correct base class (CheckpointedConnector, not CheckpointConnector) - Proper Pydantic checkpoint model (not dict-based) - Add JIRA_SERVICE_MANAGEMENT to DocumentSource enum in constants.py - Register connector in registry.py - Add source metadata to web/src/lib/sources.ts - Add tests in backend/tests/daily/connectors/jira_service_management/
- Use EntityFailure instead of failed_document=None in page-level error handling (ConnectorFailure Pydantic validator requires exactly one of failed_document/failed_entity) - Import EntityFailure from onyx.connectors.models - Fix test: generator return values are not captured by list(), update assertion accordingly
- Use EntityFailure for page-level errors (ConnectorFailure validation fix) - Fix test: drain generator manually to capture StopIteration.value - Replace hardcoded danswerai.atlassian.net with JIRA_BASE_URL env var - Import JiraServiceManagementCheckpoint in test file
Removes the misleading dead constant and wires batch_size through to maxResults in search_issues as intended.
46f1b2d to
b8aa9db
Compare
Summary
Adds a new Jira Service Management (JSM) connector that indexes tickets (Service Requests, Incidents, Problems, Changes, Service Tasks) from a specified JSM project using the Jira REST API v2.
Closes #2281
Changes
backend/onyx/connectors/jira_service_management/connector.pyCheckpointedConnectorsubclass with properJiraServiceManagementCheckpoint(ConnectorCheckpoint)Pydantic modelupdated >=filter for efficient pollingbuild_jira_clientand utility functions from the existing Jira connectorConnectorFailurefor per-issue failuresbackend/onyx/configs/constants.py: AddedJIRA_SERVICE_MANAGEMENT = "jira_service_management"toDocumentSourceenumbackend/onyx/connectors/registry.py: RegisteredJiraServiceManagementConnectorin `CONNECTOR_CLASS_MAP``web/src/lib/sources.ts: Added source metadata entry withJiraIconand displayName "Jira Service Management"backend/tests/daily/connectors/jira_service_management/test_jsm_basic.pyCredentials required
jira_base_urlhttps://yourcompany.atlassian.netproject_keyITorHELPDESKjira_user_emailjira_api_tokenIntegration test
Tested live against a real JSM instance with 4 tickets across issue types:
All documents indexed with correct
source=DocumentSource.JIRA_SERVICE_MANAGEMENT, metadata (issue_key, status, priority, issue_type), and incremental sync working correctly.Notes on previous attempt (PR #9673)
The previous PR was rejected for these specific issues — all fixed here:
CheckpointConnector(doesn't exist) -> fixed toCheckpointedConnectorCredentialedConnector(doesn't exist) -> removed,load_credentialsonBaseConnectorConnectorCheckpointwith.get()-> proper PydanticJiraServiceManagementCheckpointmodelDocumentSource.JIRA_SERVICE_MANAGEMENTmissing from enum -> added toconstants.pySummary by cubic
Adds a Jira Service Management (JSM) connector to index project tickets with incremental sync and paginated fetching. Registers the source, exposes it in the UI, and includes basic tests.
New Features
JiraServiceManagementConnectorwithJiraServiceManagementCheckpointto index JSM tickets via the Jira REST API.updated >=incremental sync andbatch_sizepagination.build_jira_client,extract_text_from_adf,get_comment_strs), setsDocumentSource.JIRA_SERVICE_MANAGEMENT, registers inbackend/onyx/connectors/registry.py, and addsjira_service_managementUI metadata inweb/src/lib/sources.tswithJiraIcon.Bug Fixes
EntityFailureto satisfyConnectorFailurevalidation.StopIteration.valueand read the Jira base URL fromJIRA_BASE_URL.self.batch_sizeformaxResults.Written for commit b8aa9db. Summary will update on new commits.