Groq
Ultra-fast LPU-accelerated inference.
Overview
Groq provides ultra-fast inference on its custom LPU (Language Processing Unit) hardware, achieving extremely low-latency responses.
Official Website: https://groq.com
API Documentation: https://console.groq.com/docs
Key Features
- Ultra-Fast Inference — LPU-accelerated for minimal latency
- High Throughput — Excellent for real-time applications
- Open-Source Models — Hosted versions of popular open-source models
- Function Calling — Tool use support (a sketch follows the usage example below)
Usage Example
```python
from openai import OpenAI

# Point the OpenAI SDK at the yuhuanstudio gateway
client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.yuhuanstudio.com/v1"
)

response = client.chat.completions.create(
    model="model-id",  # replace with a Groq model id from the Models API below
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
```
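The Key Features list mentions tool use. As a rough sketch of function calling through the same OpenAI-compatible client — the get_weather tool, its schema, and the model id are illustrative placeholders rather than values from this guide, and the sketch assumes the gateway passes OpenAI-style tool definitions through to Groq unchanged:

```python
import json
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.yuhuanstudio.com/v1"
)

# Hypothetical tool definition; any JSON-schema-described function works here
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="model-id",  # replace with a tool-capable Groq model id
    messages=[{"role": "user", "content": "What's the weather in Taipei?"}],
    tools=tools,
)

# If the model chose to call the tool, its arguments arrive as a JSON string
for tool_call in response.choices[0].message.tool_calls or []:
    print(tool_call.function.name, json.loads(tool_call.function.arguments))
```

Which Groq models accept tools is best confirmed against the Models API and Groq's own documentation.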
Available Models
Use the Models API to query available models:
```bash
curl "https://api.yuhuanstudio.com/v1/models?provider=groq" \
  -H "Authorization: Bearer YOUR_API_KEY"
```

Models and pricing are synced automatically from Groq. Check the dashboard for current availability and rates.
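The same listing can be done from the OpenAI Python SDK. A minimal sketch, assuming the gateway honors the provider query parameter shown above and returns a standard OpenAI-style model list (extra_query is the SDK's general mechanism for adding query parameters to a request):

```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.yuhuanstudio.com/v1"
)

# extra_query adds ?provider=groq to the request, mirroring the curl example
models = client.models.list(extra_query={"provider": "groq"})
for model in models.data:
    print(model.id)
```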
Official Resources
- Groq Website: https://groq.com
- Groq API Documentation: https://console.groq.com/docs