feat: improve 5 lowest-scoring skill definitions #498
@GregHolmes check this out! :o
lukeocodes left a comment
love this less-is-more approach!
@GregHolmes following up on @lukeocodes's ping from a couple of weeks back. The same change is shipped across all four Deepgram SDKs:
Happy to wait if there's a release window or batching reason; I just wanted to flag these in case they slipped off the queue. Let me know if you'd like any tweaks before merge, or who to loop in for the second sign-off on Java/Go.
Hi @rohan-tessl, thank you for taking the time to suggest these changes. The YAML parse fix on text-intelligence is a genuine catch (I confirmed its frontmatter fails to load with js-yaml on main), and the new Workflow / error-handling sections are solid additions. Two things were removed in this PR that I believe we need to restore in audio-intelligence before it merges:
It's product/runtime knowledge an agent can't infer from the types or examples, and it lives only in this skill (no sibling carries it), so removing it loses it from the skill set entirely. Please keep it in the Gotchas list. Everything else looks good to merge once those two are back.
94e1553 to c3fd965
Thanks @GregHolmes, both points are fair; will restore them.
Will push the restore in a moment.
…n gotcha

Restore the standard `## Authentication` client-setup block so the skill stands on its own when loaded in isolation (the listen.v1 examples below all assume `deepgramClient`).

Restore the diarization-quality gotcha: runtime knowledge that doesn't live anywhere else in the skill set.

Addresses GregHolmes's review feedback on deepgram#498.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hey @deepgram 👋
I ran your skills through `tessl skill review` at work and found targeted improvements in your skills. Here's the before/after:

These were easy changes to bring the skill's structure and activation in line with what performs well against Anthropic's best practices. text-intelligence had unquoted YAML with special characters that broke parsing (10% score). Trimmed verbose sections, added workflows and error handling.
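For illustration only, here is a sketch of the general class of YAML failure described above. The skill name and description text are hypothetical; the actual text-intelligence frontmatter is not shown in this thread and may have failed for a different special character.

```yaml
# Hypothetical frontmatter, not the actual text-intelligence file.
name: example-skill
# Broken: an unquoted (plain) scalar cannot contain ": ", so parsers reject this line.
# description: Use when: the user asks to analyze text
# Fixed: quoting makes the colon part of the string value.
description: "Use when: the user asks to analyze text"
```

Quoting the whole value is usually the least invasive fix, since it leaves the wording of the description untouched.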
In addition, I stress-tested your `deepgram-js-management-api` skill against a few real-world scenarios, and it held up really well. This means that your skill meaningfully improves agent steering and contributes to stronger output quality. Kudos for that!

Honest disclosure: I work at @tesslio, where we build tooling around skills like these. Not a pitch, just saw room for improvement and wanted to contribute.
If you want to self-improve your skills, or define your own scenarios to pressure test, just ask your agent (Claude Code, Codex, etc.) to evaluate and optimize your skill with Tessl. Ping me (@rohan-tessl) if you hit any snags.