Embedding models
base (Active)

Default model. Powered by mixedbread-ai/mxbai-embed-large-v1 running on ONNX Runtime. Suitable for semantic search, classification, clustering, and RAG.

Dimensions: 1024
Tokens / chunk: 512
Max tokens (plan): 512 – 4,096
Token limits by plan
Detail                       Free     Paid
Max tokens per request       512      4,096
Max chunks (512 tok each)    1        8
Approx. words embedded       ~380     ~3,000
Typical latency (cold)       ~1–2s    ~5–15s
Note
Paid teams can pass max_tokens per request (512–4,096) to trade quality for speed: a lower limit embeds fewer 512-token chunks and returns sooner. Free teams are fixed at 512 tokens (a single inference call).
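As a rough illustration of the arithmetic behind the token limits, the sketch below resolves a request's token budget and the number of 512-token chunks it implies. The helper names (effective_max_tokens, chunks_needed) are hypothetical, and two behaviours are assumptions rather than documented facts: that a paid request without max_tokens defaults to the plan ceiling, and that input beyond the budget is truncated.

```python
import math
from typing import Optional

CHUNK_TOKENS = 512                             # tokens per chunk (from the plan table)
PLAN_MAX_TOKENS = {"free": 512, "paid": 4096}  # per-request ceilings (from the plan table)


def effective_max_tokens(plan: str, requested: Optional[int] = None) -> int:
    """Resolve the token budget for one request (hypothetical helper)."""
    ceiling = PLAN_MAX_TOKENS[plan]
    # Free teams are fixed at 512 tokens, so any requested value is ignored.
    # Assumption: a paid request without max_tokens uses the plan ceiling.
    if plan == "free" or requested is None:
        return ceiling
    if not CHUNK_TOKENS <= requested <= ceiling:
        raise ValueError(f"max_tokens must be between {CHUNK_TOKENS} and {ceiling}")
    return requested


def chunks_needed(token_count: int, plan: str, requested: Optional[int] = None) -> int:
    """Number of 512-token inference calls for a text of token_count tokens.

    Assumption: text beyond the budget is truncated rather than rejected.
    """
    budget = effective_max_tokens(plan, requested)
    return max(1, math.ceil(min(token_count, budget) / CHUNK_TOKENS))
```

Under these assumptions, a 10,000-token document costs one inference call on the free plan and eight on the paid plan, while a paid request with max_tokens=1024 caps it at two.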
Long text: Auto-split into paragraphs, embed individually, combine via token-weighted average
Query prefix: Automatic search-optimised prefix for input_type "query"
Preprocessing: Whitespace normalisation, case folding, punctuation removal
Caching: Per-paragraph Redis cache (24h TTL). Repeated text and re-embedded documents with unchanged paragraphs resolve from cache
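The token-weighted average used for long text can be sketched as follows. This is a minimal illustration, not the service's code: the function name is hypothetical, and the final re-normalisation to unit length is an assumption, since the description only says the paragraph vectors are averaged.

```python
import math
from typing import List, Sequence


def token_weighted_average(vectors: Sequence[Sequence[float]],
                           token_counts: Sequence[int]) -> List[float]:
    """Combine per-paragraph embeddings, weighting each by its token count."""
    if not vectors or len(vectors) != len(token_counts):
        raise ValueError("need exactly one token count per paragraph vector")
    total = sum(token_counts)
    if total <= 0:
        raise ValueError("token counts must sum to a positive number")

    dim = len(vectors[0])
    combined = [0.0] * dim
    for vec, count in zip(vectors, token_counts):
        weight = count / total          # longer paragraphs contribute more
        for i, value in enumerate(vec):
            combined[i] += weight * value

    # Assumption: the combined vector is re-normalised to unit length.
    norm = math.sqrt(sum(v * v for v in combined))
    return [v / norm for v in combined] if norm > 0 else combined
```

With this weighting, a 300-token paragraph pulls the document vector three times as hard as a 100-token one, so a short heading or caption does not dominate the result.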
enhanced (Coming soon)

AI-augmented pipeline. A fast language model preprocesses text to extract, expand, and enrich searchable content before embedding. Query-side expansion improves recall for short or ambiguous queries.

Dimensions: 1024
Latency: TBD
Accuracy gain: TBD
Document side: Keyword extraction, concept expansion, synonym injection
Query side: Short query expansion into richer semantic representations
Compatibility: Same 1024-dim output. Drop-in replacement for base
Shielded mode
shielded: true (Active)

Property-preserving encryption applied to any model's output. Based on the Scale-and-Perturb (SAP) algorithm (Fuchsbauer et al. 2022). Approximate nearest-neighbour search is preserved while stored embeddings are protected at rest against database leaks and cross-team access.

Pipeline
01 Normalise: Unit length
02 Scale: Per-team secret factor
03 Perturb: Bounded ball noise
04 Transform: Sign flips + dimension permutation
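The four pipeline steps can be sketched in plain Python as below. This is an illustration under stated assumptions, not the production implementation: the function names, the HMAC-based seeding, the scale range of 1.0 to 2.0, the default beta, and the noise radius of scale * beta / 4 are all hypothetical choices. The noise is seeded from the input vector so that the same vector and team key always give the same output.

```python
import hashlib
import hmac
import math
import random
import struct
from typing import List, Sequence


def _rng(team_key: bytes, label: bytes, data: bytes = b"") -> random.Random:
    """Deterministic RNG seeded from an HMAC of the team key (illustrative)."""
    digest = hmac.new(team_key, label + data, hashlib.sha256).digest()
    return random.Random(int.from_bytes(digest, "big"))


def shield(vec: Sequence[float], team_key: bytes, beta: float = 0.05) -> List[float]:
    dim = len(vec)

    # 01 Normalise: project the embedding onto the unit sphere.
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0:
        raise ValueError("cannot shield a zero vector")
    unit = [v / norm for v in vec]

    # 02 Scale: per-team secret factor (the 1.0-2.0 range is an assumption).
    scale = _rng(team_key, b"scale").uniform(1.0, 2.0)
    scaled = [scale * v for v in unit]

    # 03 Perturb: add a point drawn uniformly from a ball of radius
    # scale * beta / 4 (assumed), seeded by the input for determinism.
    noise_rng = _rng(team_key, b"noise", struct.pack(f"{dim}d", *unit))
    direction = [noise_rng.gauss(0.0, 1.0) for _ in range(dim)]
    d_norm = math.sqrt(sum(d * d for d in direction))
    radius = (scale * beta / 4.0) * noise_rng.random() ** (1.0 / dim)
    perturbed = [s + radius * d / d_norm for s, d in zip(scaled, direction)]

    # 04 Transform: secret sign flips plus a secret dimension permutation.
    t_rng = _rng(team_key, b"transform")
    signs = [t_rng.choice((-1.0, 1.0)) for _ in range(dim)]
    perm = list(range(dim))
    t_rng.shuffle(perm)
    return [signs[j] * perturbed[j] for j in perm]
```

Sign flips and permutations leave distances and dot products between vectors shielded with the same key unchanged, so only the noise in step 03 affects similarity scores.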
Performance
Metric                   Unshielded   Shielded
Added latency            -            <1ms
Output dimensions        1024         1024
Near-identical cosine    ~0.99        ~0.94
Distant cosine           ~0.05        ~0.05
Ranking preserved        Yes          Yes
Reversible               Yes          No*
Constraints
Shielded embeddings from different teams cannot be compared — each team has unique encryption parameters derived from an independent cryptographic salt.
Rotating the server secret key invalidates all existing shielded embeddings. Documents must be re-embedded.
Deterministic: the same text and team always produce the same shielded vector. This enables deduplication but means low-entropy inputs (e.g. structured codes, short phrases from a known list) may be vulnerable to dictionary matching by same-team key holders.
*Reversibility: shielded vectors cannot be reversed to recover input text. An attacker who compromises both the server secret key and the per-team database salt can approximately recover the original embedding vector (not the text), so both secrets must be protected.
Ranking is preserved when the distance gap between candidates exceeds the noise parameter (beta). Very close distances may have their ordering perturbed.
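The ranking constraint follows from a triangle-inequality argument, which the sketch below makes concrete. It assumes a noise radius of scale * beta / 4 (a hypothetical parameter choice; the service's actual values are not stated here). Under that assumption each shielded point moves by at most scale * beta / 4, so any pairwise distance changes by at most scale * beta / 2, and a gap larger than beta between two candidates cannot be inverted. The sign-flip and permutation step is omitted because it leaves distances unchanged. All function names are illustrative.

```python
import math
import random
from typing import List, Sequence


def dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def random_unit(rng: random.Random, dim: int) -> List[float]:
    """Uniformly random direction on the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def scale_and_perturb(vec: Sequence[float], scale: float, beta: float,
                      rng: random.Random) -> List[float]:
    """Scale, then add noise from a ball of radius scale * beta / 4 (assumed)."""
    dim = len(vec)
    direction = random_unit(rng, dim)
    radius = (scale * beta / 4.0) * rng.random() ** (1.0 / dim)
    return [scale * v + radius * d for v, d in zip(vec, direction)]


def ranking_preserved(x, y, z, scale: float, beta: float,
                      rng: random.Random) -> bool:
    """True if y is still closer to x than z is after shielding."""
    ex, ey, ez = (scale_and_perturb(v, scale, beta, rng) for v in (x, y, z))
    return dist(ex, ey) < dist(ex, ez)
```

When dist(x, y) < dist(x, z) - beta, ranking_preserved returns True for every noise draw; when the gap is smaller than beta, the result depends on the noise and the order may flip.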
Threat model
Scenario                                                        Unshielded   Shielded
Database leak: attacker obtains stored vectors                  Exposed      Protected
Vector DB compromised (Pinecone, Qdrant, etc.)                  Exposed      Protected
Cross-team: team A's vectors used to decode team B              N/A          Protected
DBA / admin with database access but no API key                 Exposed      Protected
External attacker with own API key, no access to target team    Exposed      Protected
Compromised API key holder within the same team                 Exposed      Exposed
Note
Shielded mode protects stored embeddings at rest. It does not protect against an attacker who holds a valid API key for the same team — they can embed known texts and compare against stolen shielded vectors (dictionary attack). If this scenario is in your threat model, revoke compromised keys immediately.
Defence-in-depth: approximately recovering the original embedding vector from a shielded one requires both the server secret key and the per-team cryptographic salt (stored in the database). Compromising either one alone is insufficient.
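One way to obtain this two-secret property is to derive each team's shielding key from both values, as in the sketch below. The use of HMAC-SHA256 as the derivation function, the "shield-v1|" context label, and the function name are assumptions made for illustration; the actual derivation is not documented here.

```python
import hashlib
import hmac


def derive_team_key(server_secret: bytes, team_salt: bytes) -> bytes:
    """Derive a per-team shielding key from both secrets.

    Assumption: HMAC-SHA256 keyed with the server secret over the team salt.
    Neither input alone is enough to reproduce the output.
    """
    if not server_secret or not team_salt:
        raise ValueError("both the server secret and the team salt are required")
    # The context label is a hypothetical version tag for the derivation.
    return hmac.new(server_secret, b"shield-v1|" + team_salt, hashlib.sha256).digest()
```

With a derivation of this shape, rotating the server secret changes every team's key at once, which matches the constraint that rotation invalidates all existing shielded embeddings.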