Ultra-fast LLM inference with custom hardware (LPU). OpenAI-compatible with Groq-specific options.
GROQ_API_KEY=gsk_...
Passed via the :provider_options keyword:
service_tier
- Type: "auto" | "on_demand" | "flex" | "performance"
- Default: "auto"
- Purpose: Control the performance tier for requests
- Example:
  provider_options: [service_tier: "performance"]
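The allowed values and the "auto" default can be checked before a request is sent. The following is a minimal sketch, not part of the library: the module name `ServiceTierSketch` and the function `normalize/1` are hypothetical, and only the value list and the default come from the option description above.

```elixir
defmodule ServiceTierSketch do
  # Allowed values, copied from the documented type of :service_tier.
  @tiers ["auto", "on_demand", "flex", "performance"]

  # Hypothetical helper: returns {:ok, opts} with :service_tier filled in,
  # or {:error, reason} when the value is not one of the documented tiers.
  def normalize(provider_options) when is_list(provider_options) do
    # Fall back to the documented default when the key is absent.
    tier = Keyword.get(provider_options, :service_tier, "auto")

    if tier in @tiers do
      {:ok, Keyword.put(provider_options, :service_tier, tier)}
    else
      {:error, {:invalid_service_tier, tier}}
    end
  end
end

ServiceTierSketch.normalize([])
# => {:ok, [service_tier: "auto"]}

ServiceTierSketch.normalize(service_tier: "fast")
# => {:error, {:invalid_service_tier, "fast"}}
```

Validating locally like this surfaces a typo in the tier name before the request leaves the application; whether the library performs its own validation is not stated here.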
reasoning_effort
- Type: "none" | "default" | "low" | "medium" | "high"
- Purpose: Control the reasoning level for compatible models
- Compatible: DeepSeek R1 distill models
- Example:
  provider_options: [reasoning_effort: "high"]
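Because the option only applies to compatible models, one way to use it is to attach it conditionally. This is a rough sketch under stated assumptions: `ReasoningEffortSketch` and `put_effort/3` are hypothetical names, and detecting compatibility by a substring match on the model name is an illustrative guess, not how the library decides.

```elixir
defmodule ReasoningEffortSketch do
  # Allowed values, copied from the documented type of :reasoning_effort.
  @efforts ["none", "default", "low", "medium", "high"]

  # Hypothetical helper: adds :reasoning_effort only for models that look
  # like DeepSeek R1 distills. An effort outside @efforts matches no clause
  # and raises FunctionClauseError.
  def put_effort(provider_options, model, effort) when effort in @efforts do
    if reasoning_model?(model) do
      Keyword.put(provider_options, :reasoning_effort, effort)
    else
      provider_options
    end
  end

  # Assumption: compatibility is inferred from the model name alone.
  defp reasoning_model?(model), do: String.contains?(model, "deepseek-r1-distill")
end

# The model names below are illustrative only.
ReasoningEffortSketch.put_effort([], "deepseek-r1-distill-llama-70b", "high")
# => [reasoning_effort: "high"]

ReasoningEffortSketch.put_effort([], "some-other-model", "high")
# => []
```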
reasoning_format
- Type: String
- Purpose: Specify the format of reasoning output
- Example:
  provider_options: [reasoning_format: "detailed"]
search_settings
- Type: Map
- Purpose: Enable web search capabilities
- Keys:
  - include_domains: List of domains to include
  - exclude_domains: List of domains to exclude
- Example:
  provider_options: [
    search_settings: %{
      include_domains: ["techcrunch.com", "arstechnica.com"],
      exclude_domains: ["spam.com"]
    }
  ]
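When the domain lists are assembled at runtime, a small builder keeps the map shape consistent. The sketch below is illustrative only: `SearchSettingsSketch.build/2` is a hypothetical helper, and dropping empty lists instead of sending them is an assumption about what is desirable, not documented behavior.

```elixir
defmodule SearchSettingsSketch do
  # Hypothetical helper: builds provider options with a :search_settings map,
  # keeping only the keys whose domain list is non-empty.
  def build(include, exclude) when is_list(include) and is_list(exclude) do
    settings =
      [include_domains: include, exclude_domains: exclude]
      |> Enum.reject(fn {_key, domains} -> domains == [] end)
      |> Map.new()

    [search_settings: settings]
  end
end

SearchSettingsSketch.build(["techcrunch.com"], [])
# => [search_settings: %{include_domains: ["techcrunch.com"]}]
```

The returned keyword list has the same shape as the example above, so it could be passed as the value of provider_options.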
compound_custom
- Type: Map
- Purpose: Custom configuration for Compound systems
- Example:
  provider_options: [compound_custom: %{...}]
- Streaming: Groq's LPU hardware is well suited to streaming; tokens start arriving with very little delay
- Model Selection: Use 8b-instant for speed, 70b for quality
- Service Tier: Use "performance" for lowest latency
- Concurrency: Handles concurrent requests efficiently