Commit 716e4cd

Merge pull request #39 from deepgram/lo/add-agent-skills
chore: add agent skills for product usage and maintenance [no-ci]
2 parents df02cdc + 3d3677f commit 716e4cd

9 files changed

Lines changed: 1100 additions & 0 deletions

Lines changed: 135 additions & 0 deletions
@@ -0,0 +1,135 @@
1+
---
2+
name: deepgram-java-audio-intelligence
3+
description: Use when writing or reviewing Java code in this repo that enables Deepgram intelligence overlays on `/v1/listen` audio transcription - diarization, entity detection, sentiment, summarize, topics, intents, language detection, and redaction. Same endpoint as plain STT, but with extra request fields on `ListenV1RequestUrl` or `MediaTranscribeRequestOctetStream`. Use `deepgram-java-speech-to-text` for plain transcripts and `deepgram-java-text-intelligence` for analysis on existing text. Triggers include "audio intelligence", "diarize", "summarize audio", "sentiment from audio", "topic detection", and "redact".
4+
---
5+
6+
# Using Deepgram Audio Intelligence (Java SDK)
7+
8+
Audio intelligence is not a separate client in this SDK. It is the **Listen V1 REST request surface** with additional analysis fields enabled.
9+
10+
## When to use this product
11+
12+
- You have **audio** and want transcript + analysis together.
13+
- REST is the main path; the Java WebSocket client only exposes the real-time subset.
14+
15+
**Use a different skill when:**
16+
- You want plain transcription only → `deepgram-java-speech-to-text`.
17+
- You already have text and only need text analysis → `deepgram-java-text-intelligence`.
18+
- You need turn-aware conversational streaming → `deepgram-java-conversational-stt`.
19+
20+
## Authentication
21+
22+
```java
import com.deepgram.DeepgramClient;

DeepgramClient client = DeepgramClient.builder()
        .apiKey(System.getenv("DEEPGRAM_API_KEY"))
        .build();
```
29+
30+
## Quick start — REST with repo-backed example pattern
31+
32+
```java
import com.deepgram.resources.listen.v1.media.requests.ListenV1RequestUrl;
import com.deepgram.resources.listen.v1.media.types.MediaTranscribeRequestModel;
import com.deepgram.resources.listen.v1.media.types.MediaTranscribeResponse;

ListenV1RequestUrl request = ListenV1RequestUrl.builder()
        .url("https://dpgr.am/spacewalk.wav")
        .model(MediaTranscribeRequestModel.NOVA3)
        .smartFormat(true)
        .punctuate(true)
        .diarize(true)
        .language("en-US")
        .build();

MediaTranscribeResponse result = client.listen().v1().media().transcribeUrl(request);
```
48+
49+
The repo example `examples/listen/AdvancedOptions.java` uses the same builder pattern to enable additional Listen options.
50+
51+
## What else the REST request surface supports
52+
53+
The generated `ListenV1RequestUrl` and `MediaTranscribeRequestOctetStream` classes also expose these verified analysis fields in this checkout:
54+
55+
- `sentiment`
56+
- `summarize`
57+
- `topics`
58+
- `customTopic`
59+
- `customTopicMode`
60+
- `intents`
61+
- `customIntent`
62+
- `customIntentMode`
63+
- `detectEntities`
64+
- `detectLanguage`
65+
- `diarize`
66+
- `redact`
67+
68+
## Quick start — WebSocket subset
69+
70+
```java
import com.deepgram.resources.listen.v1.websocket.V1ConnectOptions;
import com.deepgram.resources.listen.v1.websocket.V1WebSocketClient;
import com.deepgram.types.ListenV1Model;
import java.util.concurrent.TimeUnit;

V1WebSocketClient wsClient = client.listen().v1().v1WebSocket();
wsClient.onResults(result -> System.out.println(result));

wsClient.connect(V1ConnectOptions.builder()
        .model(ListenV1Model.NOVA3)
        .diarize(true)
        .build())
        .get(10, TimeUnit.SECONDS);
```
85+
86+
In this Java checkout, the WebSocket connect options include `diarize`, `detectEntities`, `redact`, and the normal streaming transcription controls, but **not** `summarize`, `topics`, `intents`, or `detectLanguage`.
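One way to act on this split is to check the analysis fields you need before choosing a transport. The following is a minimal sketch that uses no SDK types: the class `ListenTransportChooser` and its methods are hypothetical helpers, and the two field sets are copied from the lists in this skill rather than read from the generated classes, so they would need updating if the SDK surface changes.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of the Deepgram SDK.
final class ListenTransportChooser {
    // Analysis fields this skill lists on the WebSocket connect options.
    static final Set<String> WEBSOCKET_FIELDS = new LinkedHashSet<>(
            Arrays.asList("diarize", "detectEntities", "redact"));

    // Analysis fields this skill lists on the REST request builders.
    static final Set<String> REST_FIELDS = new LinkedHashSet<>(Arrays.asList(
            "sentiment", "summarize", "topics", "customTopic", "customTopicMode",
            "intents", "customIntent", "customIntentMode",
            "detectEntities", "detectLanguage", "diarize", "redact"));

    /** Returns the requested fields that the WebSocket subset does not cover, in request order. */
    static Set<String> restOnly(Collection<String> requested) {
        Set<String> missing = new LinkedHashSet<>(requested);
        missing.removeAll(WEBSOCKET_FIELDS);
        return missing;
    }

    /** Returns true only if every requested field is known and available over WebSocket. */
    static boolean fitsWebSocket(Collection<String> requested) {
        for (String field : requested) {
            if (!REST_FIELDS.contains(field)) {
                throw new IllegalArgumentException("Unknown analysis field: " + field);
            }
        }
        return restOnly(requested).isEmpty();
    }
}
```

If `fitsWebSocket(...)` returns false, `restOnly(...)` names the fields that force the request onto `ListenV1RequestUrl` or `MediaTranscribeRequestOctetStream`.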
87+
88+
## Key parameters / API surface
89+
90+
- REST builders: `ListenV1RequestUrl` and `MediaTranscribeRequestOctetStream`
91+
- REST analysis fields verified in source: `sentiment`, `summarize`, `topics`, `customTopic`, `customTopicMode`, `intents`, `customIntent`, `customIntentMode`, `detectEntities`, `detectLanguage`, `diarize`, `redact`
92+
- Helpful transcription companions: `smartFormat`, `punctuate`, `paragraphs`, `utterances`, `numerals`, `keywords`, `keyterm`, `replace`, `search`
93+
- WebSocket subset: `diarize`, `detectEntities`, `redact`, plus standard live transcription options
94+
95+
## API reference (layered)
96+
97+
1. **In-repo source of truth**: `src/main/java/com/deepgram/resources/listen/v1/media/requests/` and `src/main/java/com/deepgram/resources/listen/v1/websocket/` plus `examples/listen/AdvancedOptions.java`. `reference.md` is absent here.
98+
2. **Canonical OpenAPI (REST)**: https://developers.deepgram.com/openapi.yaml
99+
3. **Canonical AsyncAPI (WSS subset)**: https://developers.deepgram.com/asyncapi.yaml
100+
4. **Context7**: `/llmstxt/developers_deepgram_llms_txt`
101+
5. **Product docs**:
102+
- https://developers.deepgram.com/docs/stt-intelligence-feature-overview
103+
- https://developers.deepgram.com/docs/summarization
104+
- https://developers.deepgram.com/docs/topic-detection
105+
- https://developers.deepgram.com/docs/intent-recognition
106+
- https://developers.deepgram.com/docs/sentiment-analysis
107+
- https://developers.deepgram.com/docs/language-detection
108+
- https://developers.deepgram.com/docs/redaction
109+
- https://developers.deepgram.com/docs/diarization
110+
111+
## Gotchas
112+
113+
1. **There is no separate “audio intelligence client”.** Everything hangs off Listen V1.
114+
2. **Most intelligence fields are REST-only in this SDK surface.** The WebSocket connect options do not expose `summarize`, `topics`, `intents`, or `detectLanguage`.
115+
3. **`summarize` on Listen V1 is its own generated type.** Do not assume the Read API shape is identical.
116+
4. **The repo example only demonstrates diarization-level options.** There is no dedicated example file for sentiment/topics/intents in this checkout.
117+
5. **`redact` is currently a single `String` field on the REST builders.** Do not assume Python-style string-or-list support here.
118+
6. **Model support matters.** The examples consistently use `NOVA3`; follow that unless you have verified another model supports the overlays you need.
119+
7. **These fields live on both URL and byte-upload request builders.** Pick the builder that matches your input source.
120+
121+
## Example files in this repo
122+
123+
- `examples/listen/AdvancedOptions.java`
124+
- `examples/listen/TranscribeUrl.java`
125+
- `examples/listen/FileUploadTypes.java`
126+
127+
## Central product skills
128+
129+
For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:
130+
131+
```bash
npx skills add deepgram/skills
```
134+
135+
This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).
Lines changed: 102 additions & 0 deletions
@@ -0,0 +1,102 @@
1+
---
2+
name: deepgram-java-conversational-stt
3+
description: Use when writing or reviewing Java code in this repo that calls Deepgram Conversational STT v2 / Flux over `/v2/listen`. Covers `client.listen().v2().v2WebSocket()`, `V2ConnectOptions`, `onTurnInfo`, and turn-aware close handling. Use `deepgram-java-speech-to-text` for standard v1 transcription and `deepgram-java-voice-agent` for fully interactive assistants. Triggers include "flux", "conversational stt", "listen v2", "turn detection", "end of turn", and "eot".
4+
---
5+
6+
# Using Deepgram Conversational STT / Flux (Java SDK)
7+
8+
Turn-aware streaming transcription over `/v2/listen` for conversational audio.
9+
10+
## When to use this product
11+
12+
- You want explicit turn events, not just regular interim/final transcript chunks.
13+
- You are building conversational UX where end-of-turn timing matters.
14+
15+
**Use a different skill when:**
16+
- You need general-purpose STT over REST or classic streaming → `deepgram-java-speech-to-text`.
17+
- You need a hosted interactive assistant → `deepgram-java-voice-agent`.
18+
19+
## Authentication
20+
21+
```java
import com.deepgram.DeepgramClient;

DeepgramClient client = DeepgramClient.builder()
        .apiKey(System.getenv("DEEPGRAM_API_KEY"))
        .build();
```
28+
29+
## Quick start
30+
31+
```java
import com.deepgram.resources.listen.v2.types.ListenV2CloseStream;
import com.deepgram.resources.listen.v2.types.ListenV2CloseStreamType;
import com.deepgram.resources.listen.v2.websocket.V2ConnectOptions;
import com.deepgram.resources.listen.v2.websocket.V2WebSocketClient;
import java.util.concurrent.TimeUnit;

V2WebSocketClient wsClient = client.listen().v2().v2WebSocket();

wsClient.onConnected(connected ->
        System.out.println("request_id=" + connected.getRequestId()));

wsClient.onTurnInfo(turnInfo -> {
    System.out.printf("[%s] turn=%.0f transcript=\"%s\"%n",
            turnInfo.getEvent(),
            turnInfo.getTurnIndex(),
            turnInfo.getTranscript());
});

wsClient.connect(V2ConnectOptions.builder()
        .model("flux-general-en")
        .build())
        .get(10, TimeUnit.SECONDS);

// wsClient.sendMedia(okio.ByteString.of(audioChunk));

wsClient.sendCloseStream(ListenV2CloseStream.builder()
        .type(ListenV2CloseStreamType.CLOSE_STREAM)
        .build());
```
61+
62+
## Key parameters / API surface
63+
64+
- Entry point: `client.listen().v2().v2WebSocket()`
65+
- Required connect field: `model(String)`
66+
- Verified connect options in source: `encoding`, `sampleRate`, `eagerEotThreshold`, `eotThreshold`, `eotTimeoutMs`, `keyterm`, `mipOptOut`, `tag`
67+
- Send methods: `sendMedia(...)`, `sendCloseStream(...)`
68+
- Event handlers: `onConnected(Consumer<ListenV2Connected>)`, `onTurnInfo(...)`, `onErrorMessage(...)`, plus generic connection/error hooks
69+
70+
## API reference (layered)
71+
72+
1. **In-repo source of truth**: `src/main/java/com/deepgram/resources/listen/v2/` and `examples/listen/LiveStreamingV2.java`. No `reference.md` exists in this checkout.
73+
2. **Canonical AsyncAPI**: https://developers.deepgram.com/asyncapi.yaml
74+
3. **Context7**: `/llmstxt/developers_deepgram_llms_txt`
75+
4. **Product docs**:
76+
- https://developers.deepgram.com/reference/speech-to-text/listen-flux
77+
- https://developers.deepgram.com/docs/flux/quickstart
78+
- https://developers.deepgram.com/docs/flux/language-prompting
79+
80+
## Gotchas
81+
82+
1. **This is WebSocket-only in the Java SDK.** There is no REST helper for `/v2/listen` here.
83+
2. **`model` is a plain `String`, not an enum.** Use Flux model IDs such as `flux-general-en` exactly.
84+
3. **Close with `sendCloseStream(...)`, not Listen V1 finalize.** The message type is different from v1.
85+
4. **The current Java connect options do not expose `language_hint`.** Do not assume the Python surface exists here.
86+
5. **Turn events are the main payload.** Handle `onTurnInfo(...)`, not Listen V1 `onResults(...)`.
87+
6. **You still need to stream binary audio manually.** The example only wires handlers and close flow.
88+
7. **Wait for `connect(...).get(...)` before sending media.** The client is async but not fire-and-forget.
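Gotchas 6 and 7 together suggest a small "wait, then pump" helper. The sketch below is one possible shape and deliberately avoids SDK types so it stays self-contained: `FluxAudioPump` is a hypothetical class, the `Future<?>` parameter stands in for whatever `connect(...)` returns, and the `Consumer<byte[]>` stands in for a call such as `wsClient.sendMedia(okio.ByteString.of(chunk))`. The chunk size is an illustrative parameter, not a documented requirement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

// Hypothetical helper, not part of the Deepgram SDK.
final class FluxAudioPump {
    /** Splits raw audio into ordered chunks of at most chunkSize bytes. */
    static List<byte[]> chunk(byte[] audio, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < audio.length; offset += chunkSize) {
            int end = Math.min(audio.length, offset + chunkSize);
            chunks.add(Arrays.copyOfRange(audio, offset, end));
        }
        return chunks;
    }

    /**
     * Blocks until the connection future completes (up to 10 seconds), then
     * hands every chunk to the sender in order. Returns the number of chunks sent.
     */
    static int sendWhenConnected(Future<?> connected, byte[] audio, int chunkSize,
                                 Consumer<byte[]> sender) {
        try {
            connected.get(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while connecting", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("Connection was not established", e);
        }
        List<byte[]> chunks = chunk(audio, chunkSize);
        chunks.forEach(sender);
        return chunks.size();
    }
}
```

In real code the sender lambda would wrap the WebSocket client, and `sendCloseStream(...)` would follow once the last chunk has been handed over.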
89+
90+
## Example files in this repo
91+
92+
- `examples/listen/LiveStreamingV2.java`
93+
94+
## Central product skills
95+
96+
For cross-language Deepgram product knowledge — the consolidated API reference, documentation finder, focused runnable recipes, third-party integration examples, and MCP setup — install the central skills:
97+
98+
```bash
npx skills add deepgram/skills
```
101+
102+
This SDK ships language-idiomatic code skills; `deepgram/skills` ships cross-language product knowledge (see `api`, `docs`, `recipes`, `examples`, `starters`, `setup-mcp`).
Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
1+
---
2+
name: deepgram-java-maintaining-sdk
3+
description: Use when regenerating this Java SDK with Fern, editing `.fernignore`, preparing the repo for a generator release, reconciling manual patches after regen, or deciding whether a file is permanently frozen vs temporarily frozen. This SDK is Fern-generated - most files under `src/main/java/com/deepgram/` should not be edited directly. Triggers include "fern regen", "regenerate sdk", ".fernignore", "unfreeze", "re-apply patches", and "sdk regeneration".
4+
---
5+
6+
# Maintaining the Deepgram Java SDK
7+
8+
This SDK is generated by [Fern](https://buildwithfern.com/). Most files under `src/main/java/com/deepgram/` are auto-generated and should not be edited directly. Some files are hand-written or manually patched and are listed in `.fernignore` so Fern does not overwrite them.
9+
10+
## Freeze classification rules
11+
12+
Every entry in `.fernignore` falls into one of two categories. The comment above each entry is authoritative, but when in doubt use these rules.
13+
14+
### Never unfreeze (permanently frozen)
15+
16+
These files are hand-written or maintained independently from Fern. They must stay in `.fernignore`.
17+
18+
How to identify:
19+
20+
- Custom wrapper/client code written by the repo maintainers
21+
- Transport abstractions or other hand-built infrastructure
22+
- Build files, docs, tests, examples, CI/config artifacts
23+
- Anything outside the generated Java package tree that Fern should not own
24+
25+
Current permanently frozen files and directories:
26+
27+
- `src/main/java/com/deepgram/DeepgramClient.java`
28+
- `src/main/java/com/deepgram/AsyncDeepgramClient.java`
29+
- `src/main/java/com/deepgram/DeepgramClientBuilder.java`
30+
- `src/main/java/com/deepgram/AsyncDeepgramClientBuilder.java`
31+
- `src/main/java/com/deepgram/core/transport/`
32+
- `build.gradle`, `settings.gradle`, `gradle/`, `gradlew`, `gradlew.bat`, `pom.xml`, `Makefile`
33+
- `README.md`, `CHANGELOG.md`, `CONTRIBUTING.md`, `LICENSE`
34+
- `src/test/`
35+
- `examples/`
36+
- `.editorconfig`, `.githooks/`, `.github/`, `.gitignore`
37+
- `target/`
38+
- `CLAUDE.md`, `AGENTS.md`, `.claude/`, `.agents/`
39+
40+
Also note the defensive flat-path `.fernignore` entries:
41+
42+
- `src/main/java/DeepgramClient.java`
43+
- `src/main/java/AsyncDeepgramClient.java`
44+
- `src/main/java/DeepgramClientBuilder.java`
45+
- `src/main/java/AsyncDeepgramClientBuilder.java`
46+
47+
Those flat-path files do **not** exist in this checkout. They are layout guards for alternate local generation layouts, not active source files.
48+
49+
### Unfreeze for regen (temporarily frozen)
50+
51+
These files are Fern-generated but still carry local fixes. Unfreeze them before a regen so the generator can rewrite the original path and you can diff the new output against your patched copy.
52+
53+
How to identify:
54+
55+
- Fern would regenerate the file if it were removed from `.fernignore`
56+
- The checked-in version is a patched copy of generator output
57+
58+
Current temporarily frozen files:
59+
60+
- `src/main/java/com/deepgram/core/ClientOptions.java` — preserves release-please version markers plus correct `User-Agent`, `X-Fern-SDK-Name`, and `X-Fern-SDK-Version` constants that Fern currently overwrites
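The two categories can be approximated mechanically as a first pass. The sketch below is a hypothetical `FreezeClassifier`, not a tool that exists in this repo: it hard-codes the permanently frozen paths listed above, treats any other entry under the generated package tree as a temporarily frozen candidate, and returns `UNKNOWN` for everything else. It is an assumption-laden heuristic; the comment above each `.fernignore` entry remains authoritative.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of this repo.
final class FreezeClassifier {
    enum Kind { PERMANENT, TEMPORARY, UNKNOWN }

    // Permanently frozen paths copied from this skill; entries ending in '/' are directories.
    private static final List<String> PERMANENT = Arrays.asList(
            "src/main/java/com/deepgram/DeepgramClient.java",
            "src/main/java/com/deepgram/AsyncDeepgramClient.java",
            "src/main/java/com/deepgram/DeepgramClientBuilder.java",
            "src/main/java/com/deepgram/AsyncDeepgramClientBuilder.java",
            "src/main/java/com/deepgram/core/transport/",
            "src/main/java/DeepgramClient.java",
            "src/main/java/AsyncDeepgramClient.java",
            "src/main/java/DeepgramClientBuilder.java",
            "src/main/java/AsyncDeepgramClientBuilder.java",
            "build.gradle", "settings.gradle", "gradle/", "gradlew", "gradlew.bat",
            "pom.xml", "Makefile",
            "README.md", "CHANGELOG.md", "CONTRIBUTING.md", "LICENSE",
            "src/test/", "examples/",
            ".editorconfig", ".githooks/", ".github/", ".gitignore",
            "target/", "CLAUDE.md", "AGENTS.md", ".claude/", ".agents/");

    // Root of the Fern-generated package tree named in this skill.
    private static final String GENERATED_ROOT = "src/main/java/com/deepgram/";

    /** Classifies one .fernignore entry; the permanent list is checked first. */
    static Kind classify(String entry) {
        String path = entry.trim();
        for (String frozen : PERMANENT) {
            boolean match = frozen.endsWith("/") ? path.startsWith(frozen) : path.equals(frozen);
            if (match) {
                return Kind.PERMANENT;
            }
        }
        return path.startsWith(GENERATED_ROOT) ? Kind.TEMPORARY : Kind.UNKNOWN;
    }
}
```

An `UNKNOWN` result is a prompt to read the entry's comment and decide by hand, not a verdict.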
61+
62+
## Prepare repo for regeneration
63+
64+
1. Create a branch from `main` named `lo/sdk-gen-<YYYY-MM-DD>`.
65+
2. Push it and open a PR titled `chore: SDK regeneration <YYYY-MM-DD>`.
66+
3. Read `.fernignore` and classify every entry.
67+
4. For each **temporarily frozen** file only:
68+
- Copy it to `<filename>.bak` beside the original.
69+
- In `.fernignore`, replace the original path with the `.bak` path.
70+
5. Leave **permanently frozen** entries untouched.
71+
6. Commit as `chore: unfreeze files pending regen` and push.
72+
7. Fern can now regenerate the original paths.
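Step 4 above can be sketched in Java as follows. This is an illustration under stated assumptions rather than a script that exists in the repo: `FernignoreUnfreeze` is a hypothetical class, `.fernignore` is assumed to hold one path per line, and entries are matched by exact string after trimming.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of this repo.
final class FernignoreUnfreeze {
    /**
     * Returns a copy of the .fernignore lines in which every temporarily frozen
     * path is replaced by "<path>.bak". Comments and other entries are kept as-is.
     */
    static List<String> unfreeze(List<String> fernignoreLines, Set<String> temporarilyFrozen) {
        List<String> out = new ArrayList<>(fernignoreLines.size());
        for (String line : fernignoreLines) {
            String entry = line.trim();
            out.add(temporarilyFrozen.contains(entry) ? entry + ".bak" : line);
        }
        return out;
    }

    /** Copies each temporarily frozen file to "<filename>.bak" beside the original. */
    static void backUp(Path repoRoot, Set<String> temporarilyFrozen) throws IOException {
        for (String relative : temporarilyFrozen) {
            Path original = repoRoot.resolve(relative);
            Path backup = original.resolveSibling(original.getFileName() + ".bak");
            Files.copy(original, backup, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Running `backUp(...)` before writing the rewritten lines back to `.fernignore` keeps the patched copy safe while Fern regenerates the original path.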
73+
74+
## After regeneration
75+
76+
The `.bak` files preserve the old patched versions. The original paths now contain fresh generator output.
77+
78+
1. Diff each `.bak` file against the regenerated original.
79+
2. Re-apply only the patches that are still needed.
80+
3. In `.fernignore`, change each `.bak` path back to the original path for files that still need local patches.
81+
4. Remove `.fernignore` entries entirely for files where Fern now generates the correct output.
82+
5. Delete all `.bak` files.
83+
6. Run verification:
84+
   ```bash
   ./gradlew test
   ./gradlew compileExamples
   mvn test
   ```
89+
Use `mvn verify` only when you also want the Maven Failsafe integration-test phase (`**/IntegrationTest*`) to run.
90+
7. Commit as `chore: re-apply manual patches after regen` and push.
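Steps 3 and 4 of this list are the inverse of the unfreeze edit and can be sketched the same way. The hypothetical `FernignoreRefreeze` below assumes one path per line and `#`-prefixed comments in `.fernignore`; the `stillPatched` set is the outcome of your manual diff review, not something the code can decide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of this repo.
final class FernignoreRefreeze {
    private static final String SUFFIX = ".bak";

    /**
     * Rewrites .fernignore after a regen: a "<path>.bak" entry becomes "<path>"
     * when the file still needs local patches, and is dropped when Fern's
     * output is now correct. All other lines are kept unchanged.
     */
    static List<String> refreeze(List<String> fernignoreLines, Set<String> stillPatched) {
        List<String> out = new ArrayList<>();
        for (String line : fernignoreLines) {
            String entry = line.trim();
            if (entry.startsWith("#") || !entry.endsWith(SUFFIX)) {
                out.add(line);
                continue;
            }
            String original = entry.substring(0, entry.length() - SUFFIX.length());
            if (stillPatched.contains(original)) {
                out.add(original);
            }
            // Otherwise the entry is dropped so Fern owns the file again.
        }
        return out;
    }
}
```

When an entry is dropped, the comment that described it stays behind in this sketch and would need to be removed by hand.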
91+
92+
## Java-specific notes
93+
94+
- The custom builders add **Bearer-token support**, **auto session ID headers**, and **transportFactory(...)** hooks on top of Fern's generated API client.
95+
- `ClientOptions.java` is the only currently documented temporary patch point.
96+
- The transport abstraction under `src/main/java/com/deepgram/core/transport/` is permanently hand-maintained.
97+
- `examples/` is permanently frozen and is also used as the main source of truth for skill authoring because this checkout does not include `reference.md`.
98+
- `sample-app/` is **not** listed in `.fernignore`, so it does not currently appear frozen.
99+
- `build.gradle` intentionally excludes three manage examples from `compileExamples`: `manage/ListModels.java`, `manage/MemberPermissions.java`, and `manage/UsageBreakdown.java`.
100+
101+
## Source-of-truth note
102+
103+
`AGENTS.md` in the repo root and this skill should stay synchronized. If the regeneration workflow changes, update both.
104+
105+
## Example files in this repo
106+
107+
- `AGENTS.md`
108+
- `.fernignore`
109+
- `build.gradle`
110+
- `pom.xml`
111+
- `src/main/java/com/deepgram/DeepgramClientBuilder.java`
112+
- `src/main/java/com/deepgram/core/ClientOptions.java`
113+
- `src/main/java/com/deepgram/core/transport/`
