
GLM 5.1 Together/FW, and Opus 4.7, Minimax 2.7 on together #1284

Merged
tawnymanticore merged 30 commits into remote_config from main
Apr 16, 2026

Conversation

@tawnymanticore
Collaborator

What does this PR do?

GLM 5.1 Together/FW, and Opus 4.7, Minimax 2.7 on together

sfierro and others added 30 commits April 7, 2026 15:32
Addresses 11 items: add X button to dismiss questions, preserve answers on
failed request, add Created At to spec details, allow whitespace while typing
spec names (trim on submit), add priority selector in advanced options, fix
autoselect badge persistence, rename FewShotSelector to TaskSampleSelector,
fine tune page max-width, add Re-run button for review examples, disable
copilot when full trace enabled, and add archive/unarchive to spec details.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ssages

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Lift dismissed state to parent like selections/other_texts so dismissals
survive component remounts on API failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Initialize model from ui_state store (localStorage) instead of empty
string so the previously selected model is restored on page load.
Also fix the saved-config dropdown to show "custom" immediately
instead of "Select an option" while configs load.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…intentional Custom selection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the single `user_feedback` string field on TaskRun with a proper
Feedback model that supports multiple feedback entries per run. Feedback
is a parented model under TaskRun, stored as separate files to avoid
write conflicts when multiple people provide feedback.

- Add Feedback model (feedback text + FeedbackSource enum)
- Make TaskRun a parent model with feedback children
- Remove user_feedback field from TaskRun
- Add REST API endpoints (list/create) for feedback on task runs
- Update copilot models, utils, and frontend spec builder
- Create follow-up ticket KIL-537 for repair UI replacement

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The ticket only asked to remove user_feedback from TaskRun, not to rename
it in the copilot/spec-builder code, which uses it for a different purpose.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When creating TaskRuns from reviewed examples in the copilot flow,
create Feedback children (with source=spec-feedback) after saving
the run, so review feedback is not lost.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
KIL-534 Add Feedback data model on TaskRun
KIL-522 Restore persisted model selection on Run page
Remove all repair UI code (repair form, repair edit form, repair
review/accept/delete flows) and replace with a new feedback UI that
uses the Feedback data model from KIL-534.

- Rename "Output Rating" to "Rating and Feedback"
- Add inline feedback list (up to 3, truncated) with "Add Feedback" link
- Add "All Feedback" modal with sortable table
- Add "Add Feedback" modal using FormContainer
- Delete output_repair_edit_form.svelte
- Remove model_name/provider/focus_repair_on_appear props from Run

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add request ID tracking and run ID dedup to load_feedback to prevent
  race conditions and redundant requests when switching runs
- Set add_feedback_submitting = true at start of submit_feedback

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
KIL-517 Fix misc spec builder bugs and improvements
KIL-537 Replace repair UI with feedback UI
* Add Grok 4.20 and Minimax M2.7 TogetherAI provider

Added Grok 4.20 (OpenRouter) and TogetherAI provider for Minimax M2.7 to the model list.

https://claude.ai/code/session_01S77zSCTFnNW52JiCyWpBoV

* Remove reasoning flags from Grok 4.20

Other Grok models on OpenRouter don't set reasoning_capable=True.
The model doesn't reliably return reasoning, causing 5 test failures.
Removing to match the Kiln pattern for Grok on OpenRouter.

https://claude.ai/code/session_01S77zSCTFnNW52JiCyWpBoV

* Fix Minimax M2.7 Together AI structured output config

The json_schema mode was being ignored by M2.7 on Together AI (model
returned plain text instead of JSON). Switch to json_instruction_and_object
with reasoning_optional_for_structured_output and optional_r1_thinking
parser, matching the M2.5 Together AI config that works reliably.

https://claude.ai/code/session_01F1L5ryuY5t2MxQXbNVjQGj

---------

Co-authored-by: Claude <noreply@anthropic.com>
…1281)

* Update SKILL.md

* Update SKILL.md

* Update SKILL.md

* CR
…ts (#1283)

* Update SKILL.md

* Update SKILL.md

* Update SKILL.md

* CR

* Update SKILL.md
* Add Claude Opus 4.7 to model list (anthropic, openrouter)

Adds Anthropic's new Opus 4.7 model with both Anthropic and OpenRouter
providers. Introduces CLAUDE_OPUS_4_7_ANTHROPIC_THINKING_LEVELS to
support the new "xhigh" and "max" effort levels exclusive to Opus 4.7.

* Apply zero-sum swap: demote Opus 4.6 from suggested/featured

Opus 4.7 now carries featured_rank=2, editorial_notes, suggested_for_evals,
and suggested_for_data_gen. Removing the same flags from Opus 4.6 keeps the
suggested/featured count stable across the Claude Opus family.

https://claude.ai/code/session_01Xnfzt91McoMdqaiRv1g6xg

* Add PDF support to OpenRouter provider for Opus 4.7

Adds KilnMimeType.PDF to multimodal_mime_types and sets
multimodal_requires_pdf_as_image=True (OpenRouter's PDF routing through
Mistral OCR breaks LiteLLM parsing, so PDFs must be sent as images).

https://claude.ai/code/session_01Xnfzt91McoMdqaiRv1g6xg

---------

Co-authored-by: Claude <noreply@anthropic.com>
@coderabbitai
Contributor

coderabbitai bot commented Apr 16, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8cac8474-219b-4cc7-af2c-b7116ef4340c

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a multi-source feedback system for task runs, replacing the previous single-field user feedback with a dedicated data model and API. It also adds support for new models including Claude Opus 4.7 and Grok 4.20, renames 'FewShot' components to 'TaskSample', and implements spec archiving and priority levels. Feedback was provided regarding missing multimodal flags for the new Claude model and potential race conditions in the UI's feedback loading state that could lead to a hanging loading indicator or data loss in the view.

Comment on lines 1949 to 1956
  name=ModelProviderName.anthropic,
- model_id="claude-opus-4-6",
+ model_id="claude-opus-4-7",
  structured_output_mode=StructuredOutputMode.json_schema,
  temp_top_p_exclusive=True,
- available_thinking_levels=CLAUDE_ANTHROPIC_EFFORT_THINKING_LEVELS,
+ available_thinking_levels=CLAUDE_OPUS_4_7_ANTHROPIC_THINKING_LEVELS,
  default_thinking_level="high",
  suggested_for_evals=True,
  suggested_for_data_gen=True,
Contributor


Severity: high

The anthropic provider for Claude Opus 4.7 is missing multimodal capability flags (supports_vision, multimodal_capable, multimodal_mime_types). These were present in the previous version (4.6) and are included in the openrouter provider for 4.7. Without these, multimodal features will be disabled when using the direct Anthropic provider for this model.

            KilnModelProvider(
                name=ModelProviderName.anthropic,
                model_id="claude-opus-4-7",
                structured_output_mode=StructuredOutputMode.json_schema,
                temp_top_p_exclusive=True,
                available_thinking_levels=CLAUDE_OPUS_4_7_ANTHROPIC_THINKING_LEVELS,
                default_thinking_level="high",
                suggested_for_evals=True,
                suggested_for_data_gen=True,
                supports_doc_extraction=True,
                supports_vision=True,
                multimodal_capable=True,
                multimodal_mime_types=[
                    KilnMimeType.PDF,
                    KilnMimeType.TXT,
                    KilnMimeType.MD,
                    KilnMimeType.JPG,
                    KilnMimeType.PNG,
                ],
            ),

Comment on lines +454 to +456
if (request_id === feedback_request_id) {
  feedback_loading = false
}
Contributor


Severity: medium

The feedback_loading state will hang if submit_feedback is called while a load_feedback request is pending. This happens because submit_feedback increments feedback_request_id, causing the finally block in load_feedback to skip clearing the loading state. You should ensure that feedback_loading is cleared when a request is superseded by a non-loading action like a submission.

},
)
if (fetch_error) throw fetch_error
++feedback_request_id
Contributor


Severity: medium

Incrementing feedback_request_id here cancels any pending load_feedback GET request. If a user adds feedback immediately after switching to a run (before the initial list loads), the GET request's result will be ignored, and the UI will only show the newly added feedback, losing all existing ones for that run. Consider allowing the GET request to complete and merging the results, or triggering a fresh load after the POST completes.
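A hedged sketch of the "trigger a fresh load after the POST completes" option the reviewer suggests; `post_fn` and `reload_fn` are hypothetical stand-ins for the create-feedback API call and the existing list loader, not the component's real function names.

```typescript
let add_feedback_submitting = false

async function submit_feedback(
  run_id: string,
  text: string,
  post_fn: (id: string, text: string) => Promise<void>,
  reload_fn: (id: string) => Promise<void>,
): Promise<void> {
  add_feedback_submitting = true
  try {
    await post_fn(run_id, text)
    // Rather than invalidating a pending GET by bumping the request id,
    // fetch the authoritative list after the POST so feedback that was
    // already saved (or still loading) is not dropped from the view.
    await reload_fn(run_id)
  } finally {
    add_feedback_submitting = false
  }
}
```

Reloading after the POST keeps the server as the single source of truth, at the cost of one extra request per submission.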

@github-actions

📊 Coverage Report

Overall Coverage: 91%

Diff: origin/remote_config...HEAD

  • app/desktop/studio_server/copilot_api.py (66.7%): Missing lines 434
  • app/desktop/studio_server/utils/copilot_utils.py (76.9%): Missing lines 234-238,243
  • libs/core/kiln_ai/datamodel/__init__.py (100%)
  • libs/core/kiln_ai/datamodel/datamodel_enums.py (100%)
  • libs/core/kiln_ai/datamodel/feedback.py (100%)
  • libs/core/kiln_ai/datamodel/task_run.py (80.0%): Missing lines 155
  • libs/server/kiln_server/feedback_api.py (100%)
  • libs/server/kiln_server/server.py (100%)

Summary

  • Total: 80 lines
  • Missing: 8 lines
  • Coverage: 90%

Line-by-line

View line-by-line diff coverage

app/desktop/studio_server/copilot_api.py

Lines 430-438

  430 
  431             for run in task_runs:
  432                 run.save_to_file()
  433                 saved_models.append(run)
! 434                 dataset_runs.save_pending_feedback(run)
  435 
  436             spec.save_to_file()
  437             saved_models.append(spec)
  438         except Exception:

app/desktop/studio_server/utils/copilot_utils.py

Lines 230-247

  230             self._pending_feedback[task_run.id] = feedback_text
  231 
  232     def save_pending_feedback(self, task_run: TaskRun) -> None:
  233         """Create Feedback children for a saved TaskRun if it has pending feedback."""
! 234         if not task_run.id:
! 235             return
! 236         feedback_text = self._pending_feedback.get(task_run.id)
! 237         if feedback_text:
! 238             fb = Feedback(
  239                 feedback=feedback_text,
  240                 source=FeedbackSource.spec_feedback,
  241                 parent=task_run,
  242             )
! 243             fb.save_to_file()
  244 
  245 
  246 def create_dataset_task_runs(
  247     all_examples: list[SampleApi],

libs/core/kiln_ai/datamodel/task_run.py

Lines 151-159

  151         """
  152         return self.thinking_training_data() is not None
  153 
  154     def feedback(self, readonly: bool = False) -> list[Feedback]:
! 155         return super().feedback(readonly=readonly)  # type: ignore
  156 
  157     # Workaround to return typed parent without importing Task
  158     def parent_task(self) -> Union["Task", None]:
  159         if self.parent is None or self.parent.__class__.__name__ != "Task":


@tawnymanticore tawnymanticore merged commit ded2ce8 into remote_config Apr 16, 2026
20 checks passed