# Vision (Multimodal) (/docs/vision)


Overview [#overview]

Vision-capable models can analyze images and answer questions about them. Yunxin supports multimodal inputs through the same Chat Completions API.

Sending Images [#sending-images]

Include images in the `content` array using the `image_url` type:

```json
{
  "model": "model-id",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What's in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://example.com/photo.jpg"
          }
        }
      ]
    }
  ]
}
```

Image Formats [#image-formats]

| Format          | Supported       |
| --------------- | --------------- |
| URL (HTTPS)     | Yes             |
| Base64 data URI | Yes             |
| Local file path | No (use base64) |

Base64 Encoding [#base64-encoding]

```python
import base64

with open("image.png", "rb") as f:
    base64_image = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="model-id",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/png;base64,{base64_image}"
                    }
                }
            ]
        }
    ]
)
```

Multiple Images [#multiple-images]

Send multiple images in a single request:

```python
response = client.chat.completions.create(
    model="model-id",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Compare these two images."},
                {"type": "image_url", "image_url": {"url": "https://example.com/image1.jpg"}},
                {"type": "image_url", "image_url": {"url": "https://example.com/image2.jpg"}}
            ]
        }
    ]
)
```

Image Detail Level [#image-detail-level]

Control the resolution for analysis:

```json
{
  "type": "image_url",
  "image_url": {
    "url": "https://example.com/photo.jpg",
    "detail": "high"
  }
}
```

| Detail | Description                    | Token Usage |
| ------ | ------------------------------ | ----------- |
| `low`  | 512×512 fixed                  | Lower       |
| `high` | Full resolution (up to 2048px) | Higher      |
| `auto` | Model decides                  | Varies      |

Vision-Capable Models [#vision-capable-models]

<Callout>
  Not all models support vision. Use the `GET /v1/models` endpoint and check for the `vision` capability to find models that support image inputs.
</Callout>
