GLM 5.1 Together/FW, and Opus 4.7, Minimax 2.7 on together #1284

tawnymanticore merged 30 commits into remote_config from
Conversation
Addresses 11 items:

- Add X button to dismiss questions
- Preserve answers on failed request
- Add Created At to spec details
- Allow whitespace while typing spec names (trim on submit)
- Add priority selector in advanced options
- Fix autoselect badge persistence
- Rename FewShotSelector to TaskSampleSelector
- Fine-tune page max-width
- Add Re-run button for review examples
- Disable copilot when full trace enabled
- Add archive/unarchive to spec details

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ssages Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Lift dismissed state to parent like selections/other_texts so dismissals survive component remounts on API failures. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Initialize model from ui_state store (localStorage) instead of empty string so the previously selected model is restored on page load. Also fix the saved-config dropdown to show "custom" immediately instead of "Select an option" while configs load. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…intentional Custom selection Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the single `user_feedback` string field on TaskRun with a proper Feedback model that supports multiple feedback entries per run. Feedback is a parented model under TaskRun, stored as separate files to avoid write conflicts when multiple people provide feedback.

- Add Feedback model (feedback text + FeedbackSource enum)
- Make TaskRun a parent model with feedback children
- Remove user_feedback field from TaskRun
- Add REST API endpoints (list/create) for feedback on task runs
- Update copilot models, utils, and frontend spec builder
- Create follow-up ticket KIL-537 for repair UI replacement

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The ticket only asked to remove user_feedback from TaskRun, not rename it in the copilot/spec-builder code which uses it for a different purpose. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When creating TaskRuns from reviewed examples in the copilot flow, create Feedback children (with source=spec-feedback) after saving the run, so review feedback is not lost. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
KIL-534 Add Feedback data model on TaskRun
KIL-522 Restore persisted model selection on Run page
Remove all repair UI code (repair form, repair edit form, repair review/accept/delete flows) and replace it with a new feedback UI that uses the Feedback data model from KIL-534.

- Rename "Output Rating" to "Rating and Feedback"
- Add inline feedback list (up to 3, truncated) with "Add Feedback" link
- Add "All Feedback" modal with sortable table
- Add "Add Feedback" modal using FormContainer
- Delete output_repair_edit_form.svelte
- Remove model_name/provider/focus_repair_on_appear props from Run

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add request ID tracking and run ID dedup to load_feedback to prevent race conditions and redundant requests when switching runs
- Set add_feedback_submitting = true at the start of submit_feedback

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
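The request-ID tracking described above can be sketched as follows. This is an illustrative reconstruction, not the actual Kiln component code: the variable and function names (`load_feedback`, `feedback_request_id`, `fetch_items`) mirror the names mentioned in this PR, but the implementation details are assumptions. Each call bumps a shared counter, and only the response whose captured counter still matches the latest value is allowed to touch component state.

```typescript
// Illustrative sketch of the request-ID guard; not the actual Kiln code.
let feedback_request_id = 0
let feedback_items: string[] = []
let feedback_loading = false

async function load_feedback(
  run_id: string,
  fetch_items: (run_id: string) => Promise<string[]>,
): Promise<void> {
  // Capture our own id; any later call (or a submit) bumps the counter.
  const request_id = ++feedback_request_id
  feedback_loading = true
  try {
    const items = await fetch_items(run_id)
    // A newer request superseded us while awaiting: drop the stale result.
    if (request_id !== feedback_request_id) return
    feedback_items = items
  } finally {
    // Only the latest request may clear the shared loading flag.
    if (request_id === feedback_request_id) {
      feedback_loading = false
    }
  }
}
```

With this pattern, switching runs quickly cannot leave a stale run's feedback in the view: the older request's result is discarded when its captured id no longer matches the counter.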
KIL-517 Fix misc spec builder bugs and improvements
KIL-537 Replace repair UI with feedback UI
https://getkiln.slack.com/archives/C0AG8U78MNG/p1776274097954549?thread_ts=1776273210.799549&cid=C0AG8U78MNG Co-authored-by: Claude <noreply@anthropic.com>
* Add Grok 4.20 and Minimax M2.7 TogetherAI provider

  Added Grok 4.20 (OpenRouter) and a TogetherAI provider for Minimax M2.7 to the model list.
  https://claude.ai/code/session_01S77zSCTFnNW52JiCyWpBoV

* Remove reasoning flags from Grok 4.20

  Other Grok models on OpenRouter don't set reasoning_capable=True. The model doesn't reliably return reasoning, causing 5 test failures. Removing to match the Kiln pattern for Grok on OpenRouter.
  https://claude.ai/code/session_01S77zSCTFnNW52JiCyWpBoV

* Fix Minimax M2.7 Together AI structured output config

  The json_schema mode was being ignored by M2.7 on Together AI (the model returned plain text instead of JSON). Switch to json_instruction_and_object with reasoning_optional_for_structured_output and the optional_r1_thinking parser, matching the M2.5 Together AI config that works reliably.
  https://claude.ai/code/session_01F1L5ryuY5t2MxQXbNVjQGj

---------

Co-authored-by: Claude <noreply@anthropic.com>
…1281)

* Update SKILL.md
* Update SKILL.md
* Update SKILL.md
* CR
…ts (#1283)

* Update SKILL.md
* Update SKILL.md
* Update SKILL.md
* CR
* Update SKILL.md
* Add Claude Opus 4.7 to model list (anthropic, openrouter)

  Adds Anthropic's new Opus 4.7 model with both Anthropic and OpenRouter providers. Introduces CLAUDE_OPUS_4_7_ANTHROPIC_THINKING_LEVELS to support the new "xhigh" and "max" effort levels exclusive to Opus 4.7.

* Apply zero-sum swap: demote Opus 4.6 from suggested/featured

  Opus 4.7 now carries featured_rank=2, editorial_notes, suggested_for_evals, and suggested_for_data_gen. Removing the same flags from Opus 4.6 keeps the suggested/featured count stable across the Claude Opus family.
  https://claude.ai/code/session_01Xnfzt91McoMdqaiRv1g6xg

* Add PDF support to OpenRouter provider for Opus 4.7

  Adds KilnMimeType.PDF to multimodal_mime_types and sets multimodal_requires_pdf_as_image=True (OpenRouter's PDF routing through Mistral OCR breaks LiteLLM parsing, so PDFs must be sent as images).
  https://claude.ai/code/session_01Xnfzt91McoMdqaiRv1g6xg

---------

Co-authored-by: Claude <noreply@anthropic.com>
Code Review
This pull request introduces a multi-source feedback system for task runs, replacing the previous single-field user feedback with a dedicated data model and API. It also adds support for new models including Claude 4.7 and Grok 4.20, renames 'FewShot' components to 'TaskSample', and implements spec archiving and priority levels. Feedback was provided regarding missing multimodal flags for the new Claude model and potential race conditions in the UI's feedback loading state that could lead to a hanging loading indicator or data loss in the view.
```diff
 name=ModelProviderName.anthropic,
-model_id="claude-opus-4-6",
+model_id="claude-opus-4-7",
 structured_output_mode=StructuredOutputMode.json_schema,
 temp_top_p_exclusive=True,
-available_thinking_levels=CLAUDE_ANTHROPIC_EFFORT_THINKING_LEVELS,
+available_thinking_levels=CLAUDE_OPUS_4_7_ANTHROPIC_THINKING_LEVELS,
 default_thinking_level="high",
 suggested_for_evals=True,
 suggested_for_data_gen=True,
```
The anthropic provider for Claude Opus 4.7 is missing multimodal capability flags (supports_vision, multimodal_capable, multimodal_mime_types). These were present in the previous version (4.6) and are included in the openrouter provider for 4.7. Without these, multimodal features will be disabled when using the direct Anthropic provider for this model.
```python
KilnModelProvider(
    name=ModelProviderName.anthropic,
    model_id="claude-opus-4-7",
    structured_output_mode=StructuredOutputMode.json_schema,
    temp_top_p_exclusive=True,
    available_thinking_levels=CLAUDE_OPUS_4_7_ANTHROPIC_THINKING_LEVELS,
    default_thinking_level="high",
    suggested_for_evals=True,
    suggested_for_data_gen=True,
    supports_doc_extraction=True,
    supports_vision=True,
    multimodal_capable=True,
    multimodal_mime_types=[
        KilnMimeType.PDF,
        KilnMimeType.TXT,
        KilnMimeType.MD,
        KilnMimeType.JPG,
        KilnMimeType.PNG,
    ],
),
```

```ts
if (request_id === feedback_request_id) {
  feedback_loading = false
}
```
The feedback_loading state will hang if submit_feedback is called while a load_feedback request is pending. This happens because submit_feedback increments feedback_request_id, causing the finally block in load_feedback to skip clearing the loading state. You should ensure that feedback_loading is cleared when a request is superseded by a non-loading action like a submission.
```ts
  },
)
if (fetch_error) throw fetch_error
++feedback_request_id
```
Incrementing feedback_request_id here cancels any pending load_feedback GET request. If a user adds feedback immediately after switching to a run (before the initial list loads), the GET request's result will be ignored, and the UI will only show the newly added feedback, losing all existing ones for that run. Consider allowing the GET request to complete and merging the results, or triggering a fresh load after the POST completes.
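The "trigger a fresh load after the POST completes" option suggested above could look like the following. This is a minimal sketch under assumed names (`submit_and_refresh`, `post_feedback`, `reload` are hypothetical, not Kiln's API): rather than treating the POST's local result as the full list, the component re-fetches once the POST succeeds, so feedback created elsewhere, or returned by the cancelled GET, is never dropped.

```typescript
// Sketch: re-fetch the authoritative list after a successful POST instead
// of merging locally. All names here are illustrative assumptions.
async function submit_and_refresh(
  run_id: string,
  post_feedback: (run_id: string) => Promise<void>,
  reload: (run_id: string) => Promise<void>,
): Promise<void> {
  // Persist the new feedback first; if this throws, no reload happens.
  await post_feedback(run_id)
  // Then refresh from the server, which knows about all feedback entries.
  await reload(run_id)
}
```

The trade-off is one extra request per submission, in exchange for never having to reason about merging a possibly stale local list with the server's state.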
📊 Coverage Report

Overall Coverage: 91% — Diff: origin/remote_config...HEAD

Summary

Line-by-line diff coverage (`!` marks uncovered lines):

app/desktop/studio_server/copilot_api.py — Lines 430-438:

```
  430
  431      for run in task_runs:
  432          run.save_to_file()
  433          saved_models.append(run)
! 434          dataset_runs.save_pending_feedback(run)
  435
  436      spec.save_to_file()
  437      saved_models.append(spec)
  438  except Exception:
```

app/desktop/studio_server/utils/copilot_utils.py — Lines 230-247:

```
  230          self._pending_feedback[task_run.id] = feedback_text
  231
  232      def save_pending_feedback(self, task_run: TaskRun) -> None:
  233          """Create Feedback children for a saved TaskRun if it has pending feedback."""
! 234          if not task_run.id:
! 235              return
! 236          feedback_text = self._pending_feedback.get(task_run.id)
! 237          if feedback_text:
! 238              fb = Feedback(
  239                  feedback=feedback_text,
  240                  source=FeedbackSource.spec_feedback,
  241                  parent=task_run,
  242              )
! 243              fb.save_to_file()
  244
  245
  246  def create_dataset_task_runs(
  247      all_examples: list[SampleApi],
```

libs/core/kiln_ai/datamodel/task_run.py — Lines 151-159:

```
  151      """
  152      return self.thinking_training_data() is not None
  153
  154      def feedback(self, readonly: bool = False) -> list[Feedback]:
! 155          return super().feedback(readonly=readonly)  # type: ignore
  156
  157      # Workaround to return typed parent without importing Task
  158      def parent_task(self) -> Union["Task", None]:
  159          if self.parent is None or self.parent.__class__.__name__ != "Task":
```
What does this PR do?
GLM 5.1 Together/FW, and Opus 4.7, Minimax 2.7 on together