Commit 94e1553

feat: improve 5 lowest-scoring skill definitions
1 parent c567b98 commit 94e1553

5 files changed

Lines changed: 56 additions & 123 deletions

.agents/skills/deepgram-js-audio-intelligence/SKILL.md

Lines changed: 25 additions & 48 deletions
@@ -7,16 +7,7 @@ description: Use when writing or reviewing JavaScript/TypeScript in this repo th

 Analytics overlays applied to `/v1/listen`: summaries, topics, intents, sentiment, language detection, diarization, redaction, entities. Same client surface as STT; turn features on with parameters.

-## When to use this product
-
-- You have **audio** and want analytics returned alongside the transcript.
-- REST is the primary path; the WebSocket path supports only a subset of intelligence features.
-
-**Use a different skill when:**
-- You just want transcript output → `deepgram-js-speech-to-text`.
-- You already have text and want analytics on that text → `deepgram-js-text-intelligence`.
-- You need Flux turn-taking → `deepgram-js-conversational-stt`.
-- You need a full interactive voice agent → `deepgram-js-voice-agent`.
+**Use a different skill when:** plain transcription → `deepgram-js-speech-to-text`; analytics on text → `deepgram-js-text-intelligence`; Flux turn-taking → `deepgram-js-conversational-stt`; full-duplex agent → `deepgram-js-voice-agent`.

 ## Feature availability: REST vs WSS

@@ -32,18 +23,6 @@ Analytics overlays applied to `/v1/listen`: summaries, topics, intents, sentimen
 | `sentiment` | yes | no |
 | `detect_language` | yes | no |

-## Authentication
-
-```js
-require("dotenv").config();
-
-const { DeepgramClient } = require("@deepgram/sdk");
-
-const deepgramClient = new DeepgramClient({
-  apiKey: process.env.DEEPGRAM_API_KEY,
-});
-```
-
 ## Quick start — REST with analytics

 From `examples/22-transcription-advanced-options.ts`:
@@ -70,6 +49,14 @@ const data = await deepgramClient.listen.v1.media.transcribeUrl({
   keyterm: ["keyword1", "keyword2"],
   redact: ["pci", "ssn"],
 });
+
+// Verify intelligence results are present
+const summary = data.results?.summary?.short;
+const topics = data.results?.topics?.segments;
+const sentiments = data.results?.sentiments?.segments;
+if (!summary && !topics && !sentiments) {
+  console.warn("No intelligence results — check feature/model/language support.");
+}
 ```

 ## Quick start — WSS subset
@@ -85,6 +72,13 @@ const deepgramConnection = await deepgramClient.listen.v1.createConnection({
 });
 ```

+## Workflow
+
+1. **Select features** from the REST vs WSS table. WSS lacks `summarize`, `topics`, `intents`, `sentiment`, `detect_language`.
+2. **Call** `transcribeUrl` / `transcribeFile` with chosen flags and `model: "nova-3"`.
+3. **Validate response**: check `data.results?.summary`, `data.results?.topics?.segments`, `data.results?.sentiments?.segments`. Fields are absent (not errored) when the model/language combo does not support the feature.
+4. **On missing results**: confirm the feature/model/language combination at https://developers.deepgram.com/docs/stt-intelligence-feature-overview, then retry with corrected params.
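
The validation step in the workflow above could be factored into a small helper. This is only a sketch under stated assumptions: `IntelligenceResults` is a hypothetical, pared-down view of the three fields the skill checks (the SDK's real response type is richer), `missingFeatures` is an illustrative name rather than an SDK function, and treating an empty `segments` array as "missing" is a choice made here, not documented behavior.

```typescript
// Hypothetical minimal view of the result fields named in the workflow;
// this is NOT the SDK's response type.
interface IntelligenceResults {
  summary?: { short?: string };
  topics?: { segments?: unknown[] };
  sentiments?: { segments?: unknown[] };
}

type Feature = "summarize" | "topics" | "sentiment";

// Returns the requested features whose result fields are absent or empty.
function missingFeatures(
  results: IntelligenceResults | undefined,
  requested: Feature[],
): Feature[] {
  const present: Record<Feature, boolean> = {
    summarize: Boolean(results?.summary?.short),
    // Assumption: an empty segments array is treated the same as an absent field.
    topics: (results?.topics?.segments?.length ?? 0) > 0,
    sentiment: (results?.sentiments?.segments?.length ?? 0) > 0,
  };
  return requested.filter((feature) => !present[feature]);
}
```

Called as `missingFeatures(data.results, ["summarize", "topics"])` after a transcription request, it would return the flags worth re-checking against the feature overview before retrying.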
+
 ## Key parameters / API surface

 - Analytics flags: `summarize`, `topics`, `intents`, `sentiment`, `detect_language`, `detect_entities`, `diarize`, `redact`, `custom_topic`, `custom_topic_mode`, `custom_intent`, `custom_intent_mode`.
@@ -93,28 +87,17 @@ const deepgramConnection = await deepgramClient.listen.v1.createConnection({

 ## API reference (layered)

-1. **In-repo reference**: `reference.md` → `Listen V1 Media`; WSS subset behavior lives in `src/CustomClient.ts` and `src/api/resources/listen/resources/v1/client/{Client,Socket}.ts`.
-2. **Canonical OpenAPI (REST)**: https://developers.deepgram.com/openapi.yaml
-3. **Canonical AsyncAPI (WSS)**: https://developers.deepgram.com/asyncapi.yaml
-4. **Context7**: library ID `/llmstxt/developers_deepgram_llms_txt`
-5. **Product docs**:
-   - https://developers.deepgram.com/docs/stt-intelligence-feature-overview
-   - https://developers.deepgram.com/docs/summarization
-   - https://developers.deepgram.com/docs/topic-detection
-   - https://developers.deepgram.com/docs/intent-recognition
-   - https://developers.deepgram.com/docs/sentiment-analysis
-   - https://developers.deepgram.com/docs/language-detection
-   - https://developers.deepgram.com/docs/redaction
-   - https://developers.deepgram.com/docs/diarization
+1. **In-repo**: `reference.md` → `Listen V1 Media`; WSS subset in `src/api/resources/listen/resources/v1/client/{Client,Socket}.ts`.
+2. **OpenAPI / AsyncAPI**: https://developers.deepgram.com/openapi.yaml | https://developers.deepgram.com/asyncapi.yaml
+3. **Context7**: library ID `/llmstxt/developers_deepgram_llms_txt`
+4. **Product docs**: https://developers.deepgram.com/docs/stt-intelligence-feature-overview (links to summarization, topic detection, intent recognition, sentiment, language detection, redaction, diarization).

 ## Gotchas

-1. **`summarize` on `/v1/listen` is versioned, not plain boolean.** The generated REST surface and examples point at `"v2"`.
-2. **Most intelligence flags are REST-only.** Current WSS connect args do not expose `topics`, `intents`, `sentiment`, `summarize`, or `detect_language`.
-3. **`redact` typing is looser in practice than in the generated alias.** Examples pass arrays like `["pci", "ssn"]`, even though `ListenV1Redact` itself is just a string alias.
-4. **Use `keyterm` for Nova-3 biasing.** `examples/22-transcription-advanced-options.ts` explicitly notes keywords are not supported for Nova-3.
-5. **Model/feature support is product-side.** `nova-3` is the safest choice when mixing many overlays.
-6. **Diarization quality depends on audio quality and duration.** Short or noisy clips churn speakers.
+1. **`summarize` is `"v2"`, not boolean.** The generated REST surface and examples use the string value.
+2. **`redact` accepts arrays** like `["pci", "ssn"]` despite `ListenV1Redact` being a string alias.
+3. **Use `keyterm`, not `keywords`, for Nova-3 biasing.**
+4. **Prefer `nova-3`** when mixing many overlays -- broadest feature support.

 ## Example files in this repo

@@ -125,10 +108,4 @@ const deepgramConnection = await deepgramClient.listen.v1.createConnection({

 ## Central product skills

-For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:
-
-```bash
-npx skills add deepgram/skills
-```
-
-This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).
+For cross-language Deepgram product knowledge, install `npx skills add deepgram/skills`.

.agents/skills/deepgram-js-management-api/SKILL.md

Lines changed: 10 additions & 30 deletions
@@ -7,18 +7,7 @@ description: Use when writing or reviewing JavaScript/TypeScript in this repo th

 Administrative REST endpoints under `/v1/projects`, `/v1/models`, and related project subresources.

-## When to use this product
-
-- **Projects**: list, get, update, delete, leave.
-- **Keys**: list, get, create, delete API keys.
-- **Members + invites**: inspect members, update scopes, create/delete invites.
-- **Usage + billing**: inspect requests, usage, usage breakdown, balances, purchases, billing breakdown.
-- **Models**: list global models and project-scoped models.
-- **Agent think models**: discover available model providers for Voice Agent `think` settings.
-
-**Use a different skill when:**
-- You want to run a live websocket agent session → `deepgram-js-voice-agent`.
-- You want transcription or synthesis calls rather than project/admin APIs → product-specific skills.
+**Use a different skill when:** live agent session → `deepgram-js-voice-agent`; transcription or synthesis → product-specific skills.

 ## Authentication

@@ -80,6 +69,13 @@ Think-model discovery for Voice Agent:
 await deepgramClient.agent.v1.settings.think.models.list();
 ```

+## Workflow for destructive operations
+
+1. **List** the resources first (e.g., `projects.keys.list(projectId)`).
+2. **Confirm** the target ID with the user before proceeding.
+3. **Execute** the delete/leave/remove call.
+4. **Verify** by listing again to confirm deletion.
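
The four steps above can be sketched as one guarded function. Everything named here is an assumption for illustration: `ResourceApi` is a hypothetical adapter that a caller would wrap around the relevant SDK list/delete calls, and `safeDelete`, `confirm`, and the outcome strings are not part of the SDK.

```typescript
// Hypothetical adapter over the list/delete calls in use (for example, project keys).
// The caller is assumed to map SDK responses to plain ID strings.
interface ResourceApi {
  list(): Promise<string[]>;
  remove(id: string): Promise<void>;
}

type DeleteOutcome = "not-found" | "declined" | "deleted" | "still-present";

async function safeDelete(
  api: ResourceApi,
  targetId: string,
  confirm: (id: string) => Promise<boolean>,
): Promise<DeleteOutcome> {
  // 1. List first so a mistyped ID never reaches the delete call.
  const before = await api.list();
  if (!before.includes(targetId)) return "not-found";

  // 2. Confirm the exact target with the user.
  if (!(await confirm(targetId))) return "declined";

  // 3. Execute the destructive call.
  await api.remove(targetId);

  // 4. Verify by listing again.
  const after = await api.list();
  return after.includes(targetId) ? "still-present" : "deleted";
}
```

Returning an outcome instead of throwing keeps the "declined" and "not-found" paths distinct from a failed deletion, which makes it easier to report back to the user what actually happened.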
+
 ## Key parameters / API surface

 - Projects: `client.manage.v1.projects.list/get/update/delete/leave`.
@@ -119,24 +115,8 @@ The current JS SDK does **not** expose persisted Voice Agent configuration CRUD

 ## Example files in this repo

-- `examples/13-management-projects.ts`
-- `examples/14-management-keys.ts`
-- `examples/15-management-members.ts`
-- `examples/16-management-invites.ts`
-- `examples/17-management-usage.ts`
-- `examples/18-management-billing.ts`
-- `examples/19-management-models.ts`
-- `examples/29-management-usage-breakdown.ts`
-- `examples/30-management-billing-detailed.ts`
-- `examples/31-management-member-permissions.ts`
-- `examples/32-management-project-models.ts`
+`examples/13-management-projects.ts` through `examples/19-management-models.ts`, plus `examples/29-32-*` for usage breakdown, billing details, member permissions, and project models.

 ## Central product skills

-For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:
-
-```bash
-npx skills add deepgram/skills
-```
-
-This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).
+For cross-language Deepgram product knowledge, install `npx skills add deepgram/skills`.

.agents/skills/deepgram-js-text-intelligence/SKILL.md

Lines changed: 3 additions & 15 deletions
@@ -1,19 +1,13 @@
 ---
 name: deepgram-js-text-intelligence
-description: Use when writing or reviewing JavaScript/TypeScript in this repo that calls Deepgram Text Intelligence / Read (`/v1/read`) for sentiment, summarization, topic detection, and intent recognition on text input. Covers `client.read.v1.text.analyze(...)` with `body: { text }` or `body: { url }`. Use `deepgram-js-audio-intelligence` when the source is audio instead of text. Triggers include "read API", "text intelligence", "analyze text", "sentiment", "summarize text", "topics", "intents", and "read.v1".
+description: "Use when writing or reviewing JavaScript/TypeScript in this repo that calls Deepgram Text Intelligence / Read (`/v1/read`) for sentiment, summarization, topic detection, and intent recognition on text input. Covers `client.read.v1.text.analyze(...)` with `body: { text }` or `body: { url }`. Use `deepgram-js-audio-intelligence` when the source is audio instead of text. Triggers: read API, text intelligence, analyze text, sentiment, summarize text, topics, intents, read.v1."
 ---

 # Using Deepgram Text Intelligence (JavaScript / TypeScript SDK)

 Analyze text or a hosted text URL for sentiment, summarization, topics, and intents via `/v1/read`.

-## When to use this product
-
-- You already have **text** (transcript, document, email, chat log) and want analytics.
-- You want a single REST call; there is no streaming Read API in this SDK.
-
-**Use a different skill when:**
-- Your source is audio and you want the analytics applied during transcription → `deepgram-js-audio-intelligence`.
+**Use a different skill when:** source is audio → `deepgram-js-audio-intelligence`. This API is REST-only; there is no streaming Read API in this SDK.

 ## Authentication

@@ -85,10 +79,4 @@ For broader coverage, `examples/28-text-intelligence-advanced.ts` also demonstra

 ## Central product skills

-For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:
-
-```bash
-npx skills add deepgram/skills
-```
-
-This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).
+For cross-language Deepgram product knowledge, install `npx skills add deepgram/skills`.

.agents/skills/deepgram-js-text-to-speech/SKILL.md

Lines changed: 5 additions & 13 deletions
@@ -7,13 +7,9 @@ description: Use when writing or reviewing JavaScript/TypeScript in this repo th

 Convert text to audio with one-shot REST generation or low-latency streaming synthesis via `/v1/speak`.

-## When to use this product
+Two modes: **REST** (`client.speak.v1.audio.generate`) for one-shot synthesis, **WebSocket** (`client.speak.v1.createConnection()`) for low-latency streaming.

-- **REST (`client.speak.v1.audio.generate`)** — render finished text into an audio response. Best for downloadable files, pre-generated prompts, batch synthesis.
-- **WebSocket (`client.speak.v1.createConnection()` / `connect()`)** — stream text in and receive audio out with lower latency. Best when an LLM is still producing tokens.
-
-**Use a different skill when:**
-- You need the agent to also listen, think, and handle barge-in → `deepgram-js-voice-agent`.
+**Use a different skill when:** full-duplex agent with STT + LLM + TTS → `deepgram-js-voice-agent`.

 ## Authentication

@@ -71,6 +67,8 @@ deepgramConnection.sendText({ type: "Speak", text: "Hello from streaming TTS." }
 deepgramConnection.sendFlush({ type: "Flush" });
 ```

+**Error handling:** Listen for `Warning` events in the message handler. If the connection drops, create a new connection and re-register handlers; the SDK does not auto-reconnect.
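
One way to follow the error-handling note above is to put connection creation and handler registration in the same function, so a reconnect cannot forget the handlers. The sketch below makes several assumptions: `SpeakSocket` is a hypothetical minimal shape (not the SDK's connection type), a `"close"` event is assumed to signal a dropped connection, warnings are assumed to arrive as messages whose `type` is `"Warning"`, and `openWithReconnect` is an illustrative name.

```typescript
// Hypothetical minimal socket shape for illustration; NOT the SDK's connection type.
// Assumption: a "close" event signals that the connection dropped.
interface SpeakSocket {
  on(event: "message" | "close", handler: (payload: unknown) => void): void;
  connect(): void;
}

interface ReconnectOptions {
  onMessage: (payload: unknown) => void;
  onWarning: (payload: unknown) => void;
  maxReconnects: number;
}

// Assumption: warnings arrive as message payloads shaped like { type: "Warning", ... }.
function isWarning(payload: unknown): boolean {
  return (
    typeof payload === "object" &&
    payload !== null &&
    (payload as { type?: unknown }).type === "Warning"
  );
}

// Creates a connection, registers handlers, and repeats both steps on every drop.
function openWithReconnect(
  createSocket: () => SpeakSocket,
  options: ReconnectOptions,
): void {
  let reconnects = 0;
  const open = (): void => {
    const socket = createSocket(); // a fresh connection each time
    socket.on("message", (payload) => {
      if (isWarning(payload)) options.onWarning(payload);
      else options.onMessage(payload);
    });
    socket.on("close", () => {
      if (reconnects >= options.maxReconnects) return; // give up after the cap
      reconnects += 1;
      open(); // new connection, handlers re-registered
    });
    socket.connect();
  };
  open();
}
```

The sketch reconnects immediately and only caps the number of attempts; real code would likely add a delay between attempts and decide what to do with text that was queued but never flushed.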
+
 ## Key parameters / API surface

 - REST & WSS: `model`, `encoding`, `sample_rate`, `container`, `bit_rate`, `callback`, `callback_method`, `tag`, `mip_opt_out`.
@@ -111,10 +109,4 @@ Unlike the Python SDK, this repo does **not** include a hand-written `TextBuilde

 ## Central product skills

-For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:
-
-```bash
-npx skills add deepgram/skills
-```
-
-This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).
+For cross-language Deepgram product knowledge, install `npx skills add deepgram/skills`.

.agents/skills/deepgram-js-voice-agent/SKILL.md

Lines changed: 13 additions & 17 deletions
@@ -7,16 +7,7 @@ description: Use when writing or reviewing JavaScript/TypeScript in this repo th

 Full-duplex voice agent runtime over `wss://agent.deepgram.com/v1/agent/converse`: audio in, LLM orchestration, audio out, plus function calling and prompt/runtime updates.

-## When to use this product
-
-- You want an **interactive voice assistant** where the user speaks, the agent thinks, and the agent responds with speech.
-- You need **function / tool calling** inside the conversation loop.
-- You want Deepgram to host the STT + think + TTS orchestration.
-
-**Use a different skill when:**
-- You only need transcription → `deepgram-js-speech-to-text` or `deepgram-js-conversational-stt`.
-- You only need synthesis → `deepgram-js-text-to-speech`.
-- You want project keys, usage, models, or other admin APIs → `deepgram-js-management-api`.
+**Use a different skill when:** transcription only → `deepgram-js-speech-to-text` or `deepgram-js-conversational-stt`; synthesis only → `deepgram-js-text-to-speech`; admin APIs → `deepgram-js-management-api`.

 ## Authentication

@@ -72,6 +63,17 @@ deepgramConnection.sendSettings({

 The same example also shows `client.agent.v1.settings.think.models.list()` for discovering supported think models.

+## Workflow
+
+1. `createConnection()` — returns a lazy socket; no network call yet.
+2. Register `on("message", ...)` handlers for `SettingsApplied`, `ConversationText`, `FunctionCallRequest`, `Error`, and audio payloads.
+3. `connect()` then `await waitForOpen()`.
+4. `sendSettings({ type: "Settings", ... })` **must be the first message**. Wait for `SettingsApplied` before proceeding.
+5. `sendMedia(chunk)` to stream user audio. Send `sendKeepAlive(...)` every ~5 s during silence.
+6. Handle `FunctionCallRequest` with `sendFunctionCallResponse({ type: "FunctionCallResponse", id, name, content })`.
+7. Use `sendUpdatePrompt(...)`, `sendUpdateThink(...)`, `sendUpdateSpeak(...)` for runtime changes.
+8. On `Error` event, log the error and close/reconnect as appropriate.
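
Steps 4 and 5 above impose an ordering: `Settings` first, then no audio until `SettingsApplied` arrives. A minimal sketch of enforcing that ordering follows. `AgentSocket` is a hypothetical two-method shape standing in for the SDK connection, `createSettingsGate` is an illustrative name, and buffering early audio (rather than dropping it) is a design choice assumed here, not something the source prescribes.

```typescript
// Hypothetical minimal socket shape: only the two methods this sketch calls.
interface AgentSocket {
  sendSettings(settings: { type: "Settings"; [key: string]: unknown }): void;
  sendMedia(chunk: Uint8Array): void;
}

function createSettingsGate(socket: AgentSocket) {
  let applied = false;
  const pending: Uint8Array[] = [];

  return {
    // Step 4: Settings goes out before anything else.
    start(settings: { type: "Settings"; [key: string]: unknown }): void {
      socket.sendSettings(settings);
    },

    // Call this from the on("message", ...) handler with each parsed message.
    handleMessage(message: { type?: string }): void {
      if (applied || message.type !== "SettingsApplied") return;
      applied = true;
      // Flush audio that arrived early, preserving its order.
      for (const chunk of pending.splice(0)) socket.sendMedia(chunk);
    },

    // Step 5: audio is forwarded only once SettingsApplied has arrived.
    sendAudio(chunk: Uint8Array): void {
      if (applied) socket.sendMedia(chunk);
      else pending.push(chunk); // assumption: buffer rather than drop
    },
  };
}
```

The buffer is unbounded in this sketch; if `SettingsApplied` could be delayed for long, a cap or a drop-oldest policy would be a sensible addition.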
+
 ## Key parameters / API surface

 - Connection setup: `client.agent.v1.createConnection()` / `connect()`.
@@ -114,10 +116,4 @@ This SDK exposes the **live agent runtime** plus `settings.think.models.list()`,

 ## Central product skills

-For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:
-
-```bash
-npx skills add deepgram/skills
-```
-
-This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).
+For cross-language Deepgram product knowledge, install `npx skills add deepgram/skills`.
