Pricing & Billing
Understand Yunxin's pricing model and billing system.
Pricing Model
Yunxin uses a pay-as-you-go pricing model. You are charged based on the tokens consumed (for text models) or the units generated (for image, audio, and video models).
Pricing is transparent: it is based on upstream provider costs plus a small service margin.
Token-Based Pricing
For text models (chat, completions, embeddings), pricing is per 1,000 tokens:
- Input tokens — the prompt you send
- Output tokens — the response generated
- Output tokens typically cost more than input tokens
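As a rough illustration of per-1,000-token billing, the sketch below estimates the charge for a single request. The function name `estimate_cost` and the rates passed to it are hypothetical; actual rates are listed per model on the Models page.

```python
from decimal import Decimal


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate_per_1k: Decimal,
                  output_rate_per_1k: Decimal) -> Decimal:
    """Estimate the charge for one request, billed per 1,000 tokens."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Input and output tokens are priced separately.
    input_cost = Decimal(input_tokens) / 1000 * input_rate_per_1k
    output_cost = Decimal(output_tokens) / 1000 * output_rate_per_1k
    return input_cost + output_cost


# Hypothetical rates, for illustration only (not real Yunxin prices).
cost = estimate_cost(
    input_tokens=1200,
    output_tokens=300,
    input_rate_per_1k=Decimal("0.0005"),
    output_rate_per_1k=Decimal("0.0015"),
)
print(f"Estimated cost: {cost}")
```

`Decimal` is used instead of `float` so that very small per-token amounts add up without rounding drift.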
Model Pricing
Pricing varies by model and provider. Check the Models page in the dashboard for current pricing for each model.
Factors that affect pricing:
- Model size — larger models cost more per token
- Provider — different providers have different pricing
- Modality — image, audio, and video generation have per-unit pricing
- Features — reasoning tokens (for o-series models) may have separate pricing
Billing Cycle
- Charges are calculated in real time as you make API requests
- Usage is tracked per API key and per model
- View a detailed usage breakdown in the Analytics dashboard
Budget Controls
Set spending limits to prevent unexpected charges:
- Navigate to Dashboard → Settings → Billing
- Set a monthly budget limit
- Configure alert thresholds (e.g., 50%, 80%, 100%)
- Choose action when limit is reached: warn or block
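The following sketch shows one way to reason about how alert thresholds and the limit action interact. It is a local model of the behavior described above, not Yunxin's implementation; the function name `budget_status` and the return values `"allow"`, `"warn"`, and `"block"` are assumptions made for illustration.

```python
def budget_status(spent: float, monthly_limit: float,
                  thresholds=(0.5, 0.8, 1.0),
                  action_at_limit: str = "warn"):
    """Return (crossed_thresholds, decision) for the current spend.

    thresholds are fractions of the monthly limit (0.5 means 50%).
    decision is "allow" below the limit, otherwise the configured action.
    """
    if monthly_limit <= 0:
        raise ValueError("monthly_limit must be positive")
    if action_at_limit not in ("warn", "block"):
        raise ValueError("action_at_limit must be 'warn' or 'block'")

    ratio = spent / monthly_limit
    # Every threshold at or below the current ratio has been crossed.
    crossed = [t for t in sorted(thresholds) if ratio >= t]
    decision = action_at_limit if ratio >= 1.0 else "allow"
    return crossed, decision


# 85 spent out of a 100 budget: the 50% and 80% alerts have fired.
print(budget_status(85, 100))
```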
Subscription Tiers
Yunxin offers four subscription tiers with increasing benefits:
| Tier | Access Level | Priority | Rate Limit |
|---|---|---|---|
| Free | Access to open-source and entry-level models | Low | 10 req/min |
| Basic | Access to standard models from all providers | Normal | 50 req/min |
| Pro | Access to premium models including GPT-4, Claude, Gemini | High | 100 req/min |
| Enterprise | Full access to all models including experimental releases | Critical | Unlimited |
Tier Benefits
Each tier provides different benefits:
| Feature | Free | Basic | Pro | Enterprise |
|---|---|---|---|---|
| Priority Level | Low (1) | Normal (2) | High (3) | Critical (4) |
| Rate Limit | 10 req/min | 50 req/min | 100 req/min | Unlimited |
| Webhooks | ❌ | ❌ | ✅ | ✅ |
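To stay under a per-minute rate limit, a client can pace its own requests. The sketch below is a client-side sliding-window limiter; whether the service itself counts requests with a sliding window is an assumption, and the class name `SlidingWindowLimiter` is hypothetical.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side pacing for a limit of max_requests per window."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._sent = deque()  # timestamps of recent requests

    def wait_time(self) -> float:
        """Seconds to wait before another request fits (0.0 if it fits now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])

    def record(self) -> None:
        """Call once for each request actually sent."""
        self._sent.append(self.clock())


# Free tier in the table above: 10 requests per minute.
limiter = SlidingWindowLimiter(max_requests=10)
delay = limiter.wait_time()
if delay > 0:
    time.sleep(delay)
limiter.record()
```

The injectable `clock` parameter exists so the limiter can be exercised in tests without real waiting.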
Request Priority
Your subscription tier determines your request priority level:
- Low (1) — Free tier requests are processed with standard priority
- Normal (2) — Basic tier requests receive normal processing priority
- High (3) — Pro tier requests are prioritized over Free and Basic
- Critical (4) — Enterprise tier requests receive the highest priority
The priority level is included in API responses via the X-Request-Priority header and is also available in your user profile at /api/auth/me.
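A client might read the priority header as sketched below. The exact value format of X-Request-Priority is not specified here, so this example accepts either a numeric level or a label; that tolerance, and the helper name `priority_from_headers`, are assumptions.

```python
# Levels and labels as listed in the Request Priority section.
PRIORITY_LEVELS = {1: "Low", 2: "Normal", 3: "High", 4: "Critical"}


def priority_from_headers(headers):
    """Return (level, label) parsed from X-Request-Priority, or None.

    Header names are matched case-insensitively. The value format is an
    assumption: both "3" and "High" are accepted.
    """
    lookup = {name.lower(): value for name, value in headers.items()}
    raw = lookup.get("x-request-priority")
    if raw is None:
        return None
    raw = raw.strip()
    if raw.isdigit():
        level = int(raw)
        label = PRIORITY_LEVELS.get(level)
        return (level, label) if label else None
    for level, label in PRIORITY_LEVELS.items():
        if label.lower() == raw.lower():
            return (level, label)
    return None


print(priority_from_headers({"X-Request-Priority": "3"}))
```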
Model Access
Model access is controlled by the min_tier setting on each model. If your account tier is lower than a model's requirement, the API will return a 403 Forbidden error with code model_tier_restricted.
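The tier check and the resulting error can be sketched as follows. The lowercase tier identifiers and the JSON error body shape (a `code` field, either at the top level or nested under `error`) are assumptions; only the 403 status and the model_tier_restricted code come from the description above.

```python
# Tiers in ascending order; the identifiers are assumed for illustration.
TIER_ORDER = ["free", "basic", "pro", "enterprise"]


def can_access(account_tier: str, model_min_tier: str) -> bool:
    """True if the account tier meets or exceeds the model's min_tier."""
    return (TIER_ORDER.index(account_tier.lower())
            >= TIER_ORDER.index(model_min_tier.lower()))


def is_tier_restricted(status_code: int, body) -> bool:
    """True if a response looks like a model_tier_restricted error."""
    if status_code != 403 or not isinstance(body, dict):
        return False
    error = body.get("error")
    # Accept the code nested under "error" or at the top level.
    code = error.get("code") if isinstance(error, dict) else body.get("code")
    return code == "model_tier_restricted"


print(can_access("basic", "pro"))
print(is_tier_restricted(403, {"error": {"code": "model_tier_restricted"}}))
```

Checking `can_access` before sending a request avoids a round trip that would only return the 403.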
Free Tier
New accounts receive a starter credit to explore the API. This includes:
- Access to Free-tier models
- Full API functionality
- Dashboard and analytics access
To access higher-tier models, upgrade your subscription in Dashboard → Settings → Billing.
Cost Optimization Tips
- Use smaller models for simple tasks — check the Models API for available options
- Set max_tokens to limit response length
- Use the Batch API for non-urgent bulk processing
- Cache responses for repeated identical requests
- Monitor usage in the Analytics dashboard to identify optimization opportunities
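Caching identical requests can be sketched with a small in-memory wrapper. The class name `ResponseCache`, the `send` callable, and the payload fields other than max_tokens are hypothetical; a real client would pass a function that performs the actual API call.

```python
import hashlib
import json


class ResponseCache:
    """Return a stored response when an identical payload is sent again."""

    def __init__(self, send):
        self._send = send  # callable that performs the real request
        self._store = {}

    @staticmethod
    def _key(payload: dict) -> str:
        # Canonical JSON so that key order does not change the cache key.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def request(self, payload: dict):
        key = self._key(payload)
        if key not in self._store:
            self._store[key] = self._send(payload)
        return self._store[key]


def fake_send(payload):
    """Stand-in for a real API call, for demonstration only."""
    return {"text": "ok", "echo": payload["prompt"]}


cache = ResponseCache(fake_send)
payload = {"model": "example-model", "prompt": "hello", "max_tokens": 64}
first = cache.request(payload)
second = cache.request(payload)  # served from the cache, no second call
```

This only pays off for deterministic, repeated requests; the cache is unbounded, so a long-running process would want an eviction policy.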