Pricing & Billing

Pricing Model

Yunxin uses a pay-as-you-go pricing model. You are charged for the tokens consumed (for text models) or the units generated (for image, audio, and video models).

Pricing is transparent: it is based on upstream provider costs plus a small service margin.

Token-Based Pricing

For text models (chat, completions, embeddings), pricing is per 1,000 tokens:

Input tokens — the prompt you send
Output tokens — the response generated
Output tokens typically cost more than input tokens
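
As a rough illustration of per-1,000-token pricing, the sketch below estimates the charge for a single request. The function name and the prices are hypothetical placeholders, not actual Yunxin rates; check the Models page for real values.

```python
from decimal import Decimal


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_1k: Decimal,
                  output_price_per_1k: Decimal) -> Decimal:
    """Estimate the charge for one request, priced per 1,000 tokens."""
    thousand = Decimal(1000)
    input_cost = Decimal(input_tokens) / thousand * input_price_per_1k
    output_cost = Decimal(output_tokens) / thousand * output_price_per_1k
    return input_cost + output_cost


# Hypothetical prices (assumption): output tokens cost more than input tokens.
cost = estimate_cost(1200, 300, Decimal("0.0005"), Decimal("0.0015"))
print(cost)
```

Decimal is used instead of float so that small per-token amounts do not pick up binary rounding noise when summed over many requests.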

Model Pricing

Pricing varies by model and provider. Check the Models page in the dashboard for current pricing for each model.

Factors that affect pricing:

Model size — larger models cost more per token
Provider — different providers have different pricing
Modality — image, audio, and video generation have per-unit pricing
Features — reasoning tokens (for o-series models) may have separate pricing

Billing Cycle

Charges are calculated in real-time as you make API requests
Usage is tracked per API key and per model
View detailed usage breakdown in the Analytics dashboard
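
If you also keep your own request log, a per-key, per-model breakdown similar to the one described above can be approximated locally. This is a minimal sketch; the record fields (api_key, model, input_tokens, output_tokens) are assumptions for illustration and not a documented Yunxin export format.

```python
from collections import defaultdict


def summarize_usage(records):
    """Group usage records by (api_key, model) and total their token counts."""
    totals = defaultdict(
        lambda: {"requests": 0, "input_tokens": 0, "output_tokens": 0}
    )
    for record in records:
        bucket = totals[(record["api_key"], record["model"])]
        bucket["requests"] += 1
        bucket["input_tokens"] += record["input_tokens"]
        bucket["output_tokens"] += record["output_tokens"]
    return dict(totals)


# Hypothetical log entries (assumption): key and model names are made up.
log = [
    {"api_key": "key-a", "model": "model-x", "input_tokens": 100, "output_tokens": 40},
    {"api_key": "key-a", "model": "model-x", "input_tokens": 50, "output_tokens": 10},
    {"api_key": "key-b", "model": "model-y", "input_tokens": 20, "output_tokens": 5},
]
summary = summarize_usage(log)
print(summary[("key-a", "model-x")])
```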

Budget Controls

Set spending limits to prevent unexpected charges:

Navigate to Dashboard → Settings → Billing
Set a monthly budget limit
Configure alert thresholds (e.g., 50%, 80%, 100%)
Choose the action to take when the limit is reached: warn or block
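
The interaction between alert thresholds and the limit action can be sketched as follows. This only models the behavior described in the steps above; the function name, its parameters, and the "allow" result are illustrative assumptions, not how Yunxin implements budgets internally.

```python
def budget_status(spent, monthly_limit, thresholds=(0.5, 0.8, 1.0),
                  action_at_limit="warn"):
    """Return (crossed_thresholds, decision) for the current spend.

    decision is "allow" below the limit, otherwise the configured
    action ("warn" or "block").
    """
    if monthly_limit <= 0:
        raise ValueError("monthly_limit must be positive")
    if action_at_limit not in ("warn", "block"):
        raise ValueError("action_at_limit must be 'warn' or 'block'")

    ratio = spent / monthly_limit
    # Every threshold at or below the current spend ratio has been crossed.
    crossed = [t for t in sorted(thresholds) if ratio >= t]
    decision = action_at_limit if ratio >= 1.0 else "allow"
    return crossed, decision


print(budget_status(85, 100))
print(budget_status(100, 100, action_at_limit="block"))
```

With "warn", requests would continue after the limit is reached and only an alert is raised; with "block", further requests would be rejected until the budget is raised or the month resets.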

Subscription Tiers

Yunxin offers four subscription tiers with increasing benefits:

Tier	Access Level	Priority	Rate Limit
Free	Access to open-source and entry-level models	Low	10 req/min
Basic	Access to standard models from all providers	Normal	50 req/min
Pro	Access to premium models including GPT-4, Claude, Gemini	High	100 req/min
Enterprise	Full access to all models including experimental releases	Critical	Unlimited

Tier Benefits

Each tier provides different benefits:

Feature	Free	Basic	Pro	Enterprise
Priority Level	Low (1)	Normal (2)	High (3)	Critical (4)
Rate Limit	10 req/min	50 req/min	100 req/min	Unlimited
Webhooks	❌	❌	✅	✅
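
To stay under the per-minute limits in the table, a client can throttle itself before sending. The sketch below uses a sliding 60-second window; whether Yunxin enforces a sliding or a fixed window is not stated here, so treat the window logic as an assumption. The class name and the lowercase tier keys are hypothetical.

```python
import time
from collections import deque

# Requests per minute by tier, taken from the table above (None = unlimited).
TIER_LIMITS = {"free": 10, "basic": 50, "pro": 100, "enterprise": None}


class MinuteRateLimiter:
    """Client-side limiter that allows at most N requests per 60 seconds."""

    def __init__(self, max_per_minute, clock=time.monotonic):
        self.max_per_minute = max_per_minute
        self.clock = clock          # injectable so the limiter can be tested
        self.sent = deque()         # timestamps of recently allowed requests

    def try_acquire(self):
        """Return True if a request may be sent now, False otherwise."""
        if self.max_per_minute is None:
            return True
        now = self.clock()
        # Drop timestamps that have left the 60-second window.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) < self.max_per_minute:
            self.sent.append(now)
            return True
        return False


limiter = MinuteRateLimiter(TIER_LIMITS["free"])
if limiter.try_acquire():
    print("request may be sent")
```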

Request Priority

Your subscription tier determines your request priority level:

Low (1) — Free tier requests are processed with standard priority
Normal (2) — Basic tier requests receive normal processing priority
High (3) — Pro tier requests are prioritized over Free and Basic
Critical (4) — Enterprise tier requests receive the highest priority

The priority level is included in API responses via the X-Request-Priority header and is also available in your user profile at /api/auth/me.
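
A client can read the priority from the response headers once a request returns. The exact value format of X-Request-Priority is not specified above, so this sketch accepts either the numeric level or the level name; that tolerance, and the function name, are assumptions.

```python
# Priority levels as listed above.
PRIORITY_NAMES = {1: "Low", 2: "Normal", 3: "High", 4: "Critical"}


def read_priority(headers):
    """Return (level, name) from response headers, or None if not present.

    Header names are matched case-insensitively. The value may be a
    number ("3") or a name ("High"); both forms are assumptions.
    """
    for name, value in headers.items():
        if name.lower() != "x-request-priority":
            continue
        text = str(value).strip()
        if text.isdigit():
            level = int(text)
            if level in PRIORITY_NAMES:
                return level, PRIORITY_NAMES[level]
            return None
        for level, label in PRIORITY_NAMES.items():
            if label.lower() == text.lower():
                return level, label
        return None
    return None


print(read_priority({"X-Request-Priority": "3"}))
```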

Model Access

Model access is controlled by the min_tier setting on each model. If your account tier is lower than a model's requirement, the API will return a 403 Forbidden error with code model_tier_restricted.
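
The tier comparison and the resulting error can be sketched on the client side as below. The lowercase tier names used for min_tier and the shape of the error body (a top-level or nested "code" field) are assumptions; only the 403 status and the model_tier_restricted code come from the description above.

```python
# Tier ordering, lowest to highest, following the Subscription Tiers table.
TIER_RANK = {"free": 1, "basic": 2, "pro": 3, "enterprise": 4}


def can_access(account_tier, model_min_tier):
    """Return True if the account tier meets the model's min_tier."""
    return TIER_RANK[account_tier] >= TIER_RANK[model_min_tier]


def is_tier_restricted(status_code, body):
    """Detect the tier restriction error in a parsed JSON response.

    Accepts {"error": {"code": ...}} or {"code": ...}; both shapes
    are assumptions about the response format.
    """
    if status_code != 403 or not isinstance(body, dict):
        return False
    error = body.get("error")
    code = error.get("code") if isinstance(error, dict) else body.get("code")
    return code == "model_tier_restricted"


print(can_access("basic", "pro"))
print(is_tier_restricted(403, {"error": {"code": "model_tier_restricted"}}))
```

Checking can_access before sending avoids a wasted round trip, while is_tier_restricted lets the caller fall back to a lower-tier model when the server rejects the request.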

Free Tier

New accounts receive a starter credit to explore the API. The Free tier includes:

Access to Free-tier models
Full API functionality
Dashboard and analytics access

To access higher-tier models, upgrade your subscription in Dashboard → Settings → Billing.

Cost Optimization Tips

Use smaller models for simple tasks — check the Models API for available options
Set max_tokens to limit response length
Use the Batch API for non-urgent bulk processing
Cache responses for repeated identical requests
Monitor usage in the Analytics dashboard to identify optimization opportunities
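
The caching tip can be sketched as a small wrapper that reuses a stored response when an identical request payload is seen again. The class and the send callable are hypothetical; send stands in for whatever function actually calls the API. Caching like this only makes sense when identical requests are expected to return an acceptable identical answer.

```python
import hashlib
import json


class ResponseCache:
    """In-memory cache keyed by a hash of the request payload."""

    def __init__(self, send):
        self._send = send       # callable that performs the real request
        self._store = {}
        self.hits = 0

    @staticmethod
    def key(payload):
        # Sort keys so logically identical payloads hash to the same value.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get(self, payload):
        """Return a cached response, calling send only on a cache miss."""
        k = self.key(payload)
        if k in self._store:
            self.hits += 1
            return self._store[k]
        response = self._send(payload)
        self._store[k] = response
        return response


def fake_send(payload):
    # Stand-in for a real API call (assumption for illustration).
    return {"text": "ok"}


cache = ResponseCache(fake_send)
cache.get({"model": "example-model", "max_tokens": 16, "prompt": "hi"})
cache.get({"prompt": "hi", "max_tokens": 16, "model": "example-model"})
print(cache.hits)
```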

