VL Embeddings - Qwen3-VL-Embedding-8B
Multimodal embeddings using Qwen3-VL-Embedding-8B. Supports text-only, image-only, or combined text+image inputs.
- Text + Image: one unified embedding capturing both modalities
- Image only: visual embedding for image search/similarity
- Text only: high-quality text embedding with visual grounding
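As a rough illustration of the three input modes above, the sketch below decides which mode a request falls into. The function name `input_mode` and its parameters are hypothetical and are not part of this space's actual interface.

```python
from typing import Optional


def input_mode(text: Optional[str] = None, image_path: Optional[str] = None) -> str:
    """Return which of the three input modes a request corresponds to.

    Hypothetical helper: the names here are illustrative only.
    """
    has_text = bool(text and text.strip())  # whitespace-only text counts as empty
    has_image = bool(image_path)
    if has_text and has_image:
        return "text+image"
    if has_image:
        return "image"
    if has_text:
        return "text"
    raise ValueError("provide text, an image, or both")
```

For example, `input_mode(text="Figure 3 caption", image_path="fig3.png")` falls into the combined mode, while a call with neither argument raises an error.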
Best for: documents with figures/tables, mixed content, visual RAG pipelines. For pure text at scale, use the Text-Embeddings-8B space instead.
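Embedding dimension (Matryoshka): the output dimension is adjustable. A smaller dimension reduces storage and speeds up search, at some cost in retrieval quality.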
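With Matryoshka-style embeddings, a shorter vector is typically obtained by keeping the leading components of the full vector and re-normalizing. The sketch below shows that general technique in plain Python. It is an assumption that this space reduces dimensions in exactly this way, and `truncate_embedding` is a hypothetical helper name.

```python
import math
from typing import List, Sequence


def truncate_embedding(vec: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` components of `vec` and L2-normalize the result.

    Illustrative only: assumes the embedding was trained so that its
    leading components remain meaningful on their own (Matryoshka-style).
    """
    if not 1 <= dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero prefix cannot be normalized
    return [x / norm for x in head]


def cosine_unit(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity for vectors that are already unit length."""
    return sum(x * y for x, y in zip(a, b))
```

Vectors being compared should be truncated to the same dimension, since a dot product between different lengths is not meaningful.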
Routing guide:
- Documents with charts/images → use this space
- Pure text articles → use Text-Embeddings-8B
- Unsure of content type → run a classifier first
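The routing guide above can be sketched as a small decision function. The `Route` enum, the `route` function, and the `has_visuals` flag are hypothetical names chosen for illustration. How a document is checked for charts or images, and which classifier is used, are left open here.

```python
from enum import Enum
from typing import Optional


class Route(Enum):
    VL_EMBEDDINGS = "this space (VL embeddings)"
    TEXT_EMBEDDINGS = "Text-Embeddings-8B"
    CLASSIFY_FIRST = "run classifier first"


def route(has_visuals: Optional[bool]) -> Route:
    """Pick a destination following the routing guide.

    has_visuals: True if the document contains charts or images,
    False if it is pure text, None if the content type is unknown.
    """
    if has_visuals is None:
        return Route.CLASSIFY_FIRST
    return Route.VL_EMBEDDINGS if has_visuals else Route.TEXT_EMBEDDINGS
```

Once the classifier has produced a definite answer, the same function can be called again with `True` or `False` to get the final destination.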
Built by Xavier Fuentes @ AI Enablement Academy