Our Large Language Model as a Service (LLMaaS) offering gives you access to cutting-edge language models served from SecNumCloud-qualified infrastructure that is HDS-certified for healthcare data hosting. The service is sovereign: all inference is computed in France. Benefit from high performance and strong security for your AI applications. Your data remains strictly confidential and is neither exploited nor stored after processing.
Chat & Reasoning
Our large models offer state-of-the-art performance for the most demanding tasks. They are particularly well-suited to applications requiring a deep understanding of language, complex reasoning or the processing of long documents.
qwen3.6:27b
gpt-oss:120b
gpt-oss:20b
llama3.3:70b
gemma3:27b
nemotron-3-super:120b
nemotron3-nano:30b
nemotron-cascade:30b
glm-4.7-flash:30b
cogito:32b
olmo-3:32b
olmo-3:7b
qwen3-2507:235b
mistral-small3.2:24b
mistral-small4:119b
ministral-3:14b
ministral-3:8b
ministral-3:3b
qwen3.5:9b
qwen3.5:4b
qwen3.5:0.8b
qwen3:0.6b
qwen3-2507-think:4b
qwen3-omni:30b
Programming & Agents
Our programming and agent models are specially optimised for agentic software engineering, large-scale code generation and development workflow automation.
qwen3.6:35b
qwen-coder-next:80b
qwen3-next:80b
devstral-small-2:24b
rnj-1:8b
functiongemma:270m
Vision & Multimodal
Our Vision & Multimodal models can analyse images, videos and visual documents. They excel in OCR, object detection, structured extraction and spatio-temporal reasoning.
qwen3-vl:235b
qwen3-vl:32b
qwen3-vl:30b
qwen3-vl:8b
qwen3-vl:4b
qwen3-vl:2b
gemma4:31b
gemma4:e2b
gemma4:e4b
granite3.2-vision:2b
deepseek-ocr
Embedding
Our embedding models transform text into vector representations for semantic search, clustering and RAG (Retrieval-Augmented Generation) pipelines.
bge-m3:567m
qwen3-embedding:4b
qwen3-embedding:8b
qwen3-embedding:0.6b
granite-embedding:278m
embeddinggemma:300m
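As a minimal sketch of how embedding vectors support semantic search, the example below ranks documents by cosine similarity to a query. The three-dimensional vectors are toy values standing in for real embeddings, and the function names are illustrative; neither comes from the service itself.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, v) for v in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)

# Toy vectors (assumption: real embeddings have hundreds or thousands of dimensions).
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_by_similarity(query, docs)
```

In a real pipeline the vectors would come from one of the embedding models listed above, and a vector database would usually replace the linear scan.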
Reranking
Our reranking models reorder search results by relevance to refine the quality of RAG pipelines. Compatible with the Cohere API.
nvidia/llama-nemotron-rerank-vl-1b-v2
qwen3-reranker:4b
qwen3-reranker:0.6b
bge-reranker-large
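The sketch below shows what a Cohere-style rerank exchange might look like. The field names (`model`, `query`, `documents`, `top_n`, and `results` with `index` and `relevance_score`) follow Cohere's public rerank API. The endpoint URL, authentication and exact response shape of this service are not shown and should be checked against its API reference. The sample response is hand-written for illustration, not real model output.

```python
import json

def build_rerank_request(model, query, documents, top_n):
    """Build a Cohere-style rerank request body as a JSON string."""
    return json.dumps(
        {"model": model, "query": query, "documents": documents, "top_n": top_n}
    )

def apply_rerank_response(documents, response):
    """Reorder documents using the `results` list of a Cohere-style response."""
    ordered = sorted(
        response["results"], key=lambda r: r["relevance_score"], reverse=True
    )
    return [(documents[r["index"]], r["relevance_score"]) for r in ordered]

documents = [
    "Our offices are closed on public holidays.",
    "Embedding models turn text into vectors.",
    "SecNumCloud is a French security qualification for cloud providers.",
]
body = build_rerank_request(
    "qwen3-reranker:0.6b", "What is SecNumCloud?", documents, top_n=2
)

# Hand-written sample response (assumption), shaped like a Cohere rerank reply.
sample_response = {
    "results": [
        {"index": 2, "relevance_score": 0.91},
        {"index": 0, "relevance_score": 0.12},
    ]
}
reranked = apply_rerank_response(documents, sample_response)
```

Each result refers back to the input list by `index`, so the original documents must be kept on the client side to rebuild the ordered list.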
Security
Our security models specialise in detecting problematic content, preventing jailbreaks and supporting regulatory compliance (GDPR, HDS). They can be used as pre-filters or post-filters in your workflows.
granite3-guardian:8b
granite3-guardian:2b
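One possible way to wire a security model in as both pre-filter and post-filter is sketched below. The `is_unsafe` and `generate` callables are placeholders: in practice the first would call a guardian model and the second a chat model. The stub implementations here are hypothetical and exist only to make the example runnable.

```python
from typing import Callable

BLOCKED_MESSAGE = "Request blocked by the security filter."

def guarded_generate(
    prompt: str,
    is_unsafe: Callable[[str], bool],
    generate: Callable[[str], str],
) -> str:
    """Run a generation call with a safety check before and after it."""
    # Pre-filter: reject the prompt before it reaches the main model.
    if is_unsafe(prompt):
        return BLOCKED_MESSAGE
    answer = generate(prompt)
    # Post-filter: reject the answer before it reaches the user.
    if is_unsafe(answer):
        return BLOCKED_MESSAGE
    return answer

# Hypothetical stubs standing in for real model calls.
def keyword_guard(text: str) -> bool:
    return "forbidden" in text.lower()

def echo_model(prompt: str) -> str:
    return f"Answer to: {prompt}"

safe_result = guarded_generate("hello", keyword_guard, echo_model)
blocked_result = guarded_generate("a FORBIDDEN request", keyword_guard, echo_model)
```

Keeping the classifier and the generator as injected callables makes it easy to swap the smaller guardian model for the larger one without touching the workflow.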
Translation
Our translation models deliver high-fidelity translation across 55 languages while preserving the grammar, cultural nuances and technical terminology of the source documents.
translategemma:27b
translategemma:12b
translategemma:4b
Audio & Image
Our Audio & Image models provide real-time voice transcription (streaming ASR) and image generation from text descriptions. Both are compatible with the OpenAI API.
voxtral
z-image:16b
Model comparison
This comparison table will help you choose the model best suited to your needs, based on various criteria such as context size, performance and specific use cases.
| Model | Publisher | Parameters | Context (tokens) | Vision | Agent | Reasoning | Security | Fast * | Energy efficiency * |
|---|---|---|---|---|---|---|---|---|---|
| Chat & Reasoning | |||||||||
| qwen3.6:27b | Qwen Team | 27B | 1000000 | ||||||
| gpt-oss:120b | OpenAI | 120B | 120000 | ||||||
| gpt-oss:20b | OpenAI | 20B | 120000 | ||||||
| llama3.3:70b | Meta | 70B | 132000 | ||||||
| gemma3:27b | Google | 27B | 120000 | ||||||
| nemotron-3-super:120b | NVIDIA | 120B | 1000000 | ||||||
| nemotron3-nano:30b | NVIDIA | 30B | 1000000 | ||||||
| nemotron-cascade:30b | NVIDIA | 30B | 1000000 | ||||||
| glm-4.7-flash:30b | Zhipu AI | 30B | 120000 | ||||||
| cogito:32b | Deep Cogito | 32B | 32000 | ||||||
| olmo-3:32b | AllenAI | 32B | 65536 | ||||||
| olmo-3:7b | AllenAI | 7B | 65536 | ||||||
| qwen3-2507:235b | Qwen Team | 235B | 200000 | ||||||
| mistral-small3.2:24b | Mistral AI | 24B | 128000 | ||||||
| mistral-small4:119b | Mistral AI | 119B | 262144 | ||||||
| ministral-3:14b | Mistral AI | 14B | 250000 | ||||||
| ministral-3:8b | Mistral AI | 8B | 250000 | ||||||
| ministral-3:3b | Mistral AI | 3B | 250000 | ||||||
| qwen3.5:9b | Qwen Team | 9B | 250000 | ||||||
| qwen3.5:4b | Qwen Team | 4B | 250000 | ||||||
| qwen3.5:0.8b | Qwen Team | 0.8B | 250000 | ||||||
| qwen3:0.6b | Qwen Team | 0.6B | 40000 | ||||||
| qwen3-2507-think:4b | Qwen Team | 4B | 250000 | ||||||
| qwen3-omni:30b | Qwen Team | 30B | 32768 | ||||||
| Programming & Agents | |||||||||
| qwen3.6:35b | Qwen Team | 35B | 1000000 | ||||||
| qwen-coder-next:80b | Qwen Team | 80B | 250000 | ||||||
| qwen3-next:80b | Qwen Team | 80B | 250000 | ||||||
| devstral-small-2:24b | Mistral AI & All Hands AI | 24B | 200000 | ||||||
| rnj-1:8b | Essential AI | 8B | 32000 | ||||||
| functiongemma:270m | Google | 270M | 32768 | ||||||
| Vision & Multimodal | |||||||||
| qwen3-vl:235b | Qwen Team | 235B | 200000 | ||||||
| qwen3-vl:32b | Qwen Team | 32B | 250000 | ||||||
| qwen3-vl:30b | Qwen Team | 30B | 250000 | ||||||
| qwen3-vl:8b | Qwen Team | 8B | 250000 | ||||||
| qwen3-vl:4b | Qwen Team | 4B | 250000 | ||||||
| qwen3-vl:2b | Qwen Team | 2B | 250000 | ||||||
| gemma4:31b | Google | 31B | 250000 | ||||||
| gemma4:e2b | Google | 31B (E2B) | 128000 | ||||||
| gemma4:e4b | Google | 31B (E4B) | 128000 | ||||||
| granite3.2-vision:2b | IBM | 2B | 16384 | ||||||
| deepseek-ocr | DeepSeek AI | 3B | 8192 | ||||||
| Embedding | |||||||||
| bge-m3:567m | BAAI | 567M | 8192 | ||||||
| qwen3-embedding:4b | Qwen Team | 4B | 40000 | ||||||
| qwen3-embedding:8b | Qwen Team | 8B | 40000 | ||||||
| qwen3-embedding:0.6b | Qwen Team | 0.6B | 32768 | ||||||
| granite-embedding:278m | IBM | 278M | 512 | ||||||
| embeddinggemma:300m | Google | 300M | 2048 | ||||||
| Reranking | |||||||||
| nvidia/llama-nemotron-rerank-vl-1b-v2 | NVIDIA | 1B | 4096 | N/A | |||||
| qwen3-reranker:4b | Qwen Team | 4B | 4096 | N/A | |||||
| qwen3-reranker:0.6b | Qwen Team | 0.6B | 4096 | N/A | |||||
| bge-reranker-large | BAAI | 335M | 512 | N/A | |||||
| Security | |||||||||
| granite3-guardian:8b | IBM | 8B | 8192 | ||||||
| granite3-guardian:2b | IBM | 2B | 8192 | ||||||
| Translation | |||||||||
| translategemma:27b | Google | 27B | 120000 | ||||||
| translategemma:12b | Google | 12B | 128000 | ||||||
| translategemma:4b | Google | 4B | 128000 | ||||||
| Audio & Image | |||||||||
| voxtral | Mistral AI | 4B | 32768 | N/A | |||||
| z-image:16b | Community | 16B | N/A | ||||||
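
When context size is the deciding criterion, the table above can be queried programmatically. The sketch below copies the context sizes of a few rows into a dictionary and returns the models whose window can hold a given prompt plus the expected output. The helper names are illustrative, and token counts for a real document would need to be measured with the target model's tokenizer.

```python
# Context sizes (in tokens) copied from a few rows of the comparison table.
CONTEXT_TOKENS = {
    "qwen3.6:27b": 1_000_000,
    "gpt-oss:120b": 120_000,
    "llama3.3:70b": 132_000,
    "cogito:32b": 32_000,
    "qwen3-2507:235b": 200_000,
    "qwen3:0.6b": 40_000,
}

def token_budget(prompt_tokens, max_output_tokens):
    """Total tokens the context window must hold for one request."""
    return prompt_tokens + max_output_tokens

def models_fitting(required_tokens, catalog=CONTEXT_TOKENS):
    """Return models whose context holds `required_tokens`, smallest window first."""
    fitting = [(ctx, name) for name, ctx in catalog.items() if ctx >= required_tokens]
    return [name for _, name in sorted(fitting)]

# Example: a 150,000-token document with room for a 4,000-token answer.
need = token_budget(150_000, 4_000)
candidates = models_fitting(need)
```

Sorting by window size puts the tightest fit first, which is a reasonable default when a smaller context is cheaper to serve; other criteria from the table can be added as extra dictionary fields.
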
Recommended use cases
Here are some common use cases and the most suitable models for each. These recommendations are based on the specific performance and capabilities of each model.
Multilingual dialogue
- nemotron-3-super:120b
- qwen3.6:27b
- nemotron3-nano:30b
- gpt-oss:120b
Analysis of long documents
- nemotron-3-super:120b
- qwen3.6:27b
- qwen3-2507:235b
Programming and development
- qwen3.6:35b
- qwen-coder-next:80b
- devstral-small-2:24b
- nemotron-3-super:120b
Visual analysis
- qwen3-vl:235b
- gemma4:31b
- deepseek-ocr
- qwen3-vl:30b
Safety and compliance
- granite3-guardian:8b
- granite3-guardian:2b
- mistral-small4:119b
Lightweight deployments
- qwen3.5:0.8b
- qwen3-vl:2b
- ministral-3:3b
RAG (Retrieval-Augmented Generation)
- bge-m3:567m
- nvidia/llama-nemotron-rerank-vl-1b-v2
- qwen3.6:27b
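The three models recommended for RAG map onto three stages: embedding-based retrieval, reranking and generation. The sketch below shows how those stages might be chained. The stage functions are injected as callables, and the word-overlap stubs are hypothetical stand-ins for real model calls, used only so the example runs offline.

```python
import re

def rag_answer(question, corpus, retrieve, rerank, generate, top_k=2):
    """Chain retrieval, reranking and generation into one call."""
    candidates = retrieve(question, corpus)       # stage 1: embedding model
    best = rerank(question, candidates)[:top_k]   # stage 2: reranking model
    context = "\n".join(best)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return generate(prompt)                       # stage 3: chat model

# Hypothetical stubs: word overlap replaces embeddings and relevance scores.
def words(text):
    return set(re.findall(r"\w+", text.lower()))

def overlap_retrieve(question, corpus):
    return [doc for doc in corpus if words(question) & words(doc)]

def overlap_rerank(question, docs):
    return sorted(docs, key=lambda d: len(words(question) & words(d)), reverse=True)

def echo_generate(prompt):
    return prompt  # a real chat model would answer from the context instead

corpus = [
    "Paris is the capital of France.",
    "Embeddings map text to vectors.",
    "The capital of Italy is Rome.",
]
result = rag_answer(
    "What is the capital of France?",
    corpus, overlap_retrieve, overlap_rerank, echo_generate, top_k=1,
)
```

Retrieval is kept broad and cheap while the reranker narrows the candidates, so only the most relevant passages consume the generation model's context window.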