Models — PolyEmbed

Embedding models

base (Active)

Default model. Powered by mixedbread-ai/mxbai-embed-large-v1 running on ONNX Runtime. Suitable for semantic search, classification, clustering, and RAG.

Dimensions: 1024

Tokens / chunk: 512

Max tokens (plan): 512–4,096

Token limits by plan

Detail | Free | Paid
Max tokens per request | 512 | 4,096
Max chunks (512 tok each)
Approx. words embedded | ~380 | ~3,000
Typical latency (cold) | ~1–2s | ~5–15s

Note

Paid teams can pass max_tokens per request (512–4,096) to trade quality for speed: a lower value embeds less of the text but returns faster. Free teams are fixed at 512 tokens (a single inference call).
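As a rough illustration of how the plan limits interact with the 512-token chunk size, the sketch below clamps a requested max_tokens to the plan range and derives the chunk count by ceiling division. The function name effective_limits and the plan labels are hypothetical. Clamping out-of-range values (rather than rejecting them) and defaulting paid requests to the plan ceiling are assumptions, not documented behaviour.

```python
import math
from typing import Optional, Tuple

CHUNK_TOKENS = 512                               # tokens per chunk, from the model card
PLAN_MAX_TOKENS = {"free": 512, "paid": 4096}    # per-request ceilings by plan


def effective_limits(plan: str, max_tokens: Optional[int] = None) -> Tuple[int, int]:
    """Return (token budget, chunk count) for one request.

    Assumption: out-of-range values are clamped, not rejected.
    """
    ceiling = PLAN_MAX_TOKENS[plan]
    if plan == "free" or max_tokens is None:
        # Free is fixed at its ceiling; the paid default is an assumption.
        budget = ceiling
    else:
        budget = min(max(max_tokens, CHUNK_TOKENS), ceiling)
    chunks = math.ceil(budget / CHUNK_TOKENS)
    return budget, chunks
```

Under these assumptions a free request always resolves to one chunk, while a paid request at the 4,096-token ceiling spans eight.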

Long text: Auto-split into paragraphs, embed individually, combine via token-weighted average

Query prefix: Automatic search-optimised prefix for input_type "query"

Preprocessing: Whitespace normalisation, case folding, punctuation removal

Caching: Per-paragraph Redis cache (24h TTL). Repeated text and re-embedded documents with unchanged paragraphs resolve from cache
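The "Long text" behaviour can be pictured with a minimal sketch of a token-weighted average: each paragraph vector contributes in proportion to its token count. The function name combine_chunks is hypothetical, and the final re-normalisation to unit length is an assumption; the source only states that paragraph embeddings are combined by a token-weighted average.

```python
import math
from typing import List, Sequence


def combine_chunks(vectors: Sequence[Sequence[float]],
                   token_counts: Sequence[int]) -> List[float]:
    """Token-weighted average of per-paragraph embeddings."""
    if not vectors or len(vectors) != len(token_counts):
        raise ValueError("need exactly one token count per vector")
    total = sum(token_counts)
    if total <= 0:
        raise ValueError("token counts must sum to a positive number")

    combined = [0.0] * len(vectors[0])
    for vec, n_tokens in zip(vectors, token_counts):
        weight = n_tokens / total            # longer paragraphs weigh more
        for i, value in enumerate(vec):
            combined[i] += weight * value

    # Assumption: the combined vector is rescaled to unit length.
    norm = math.sqrt(sum(v * v for v in combined))
    return [v / norm for v in combined] if norm > 0 else combined
```

For example, combining [1, 0] (300 tokens) with [0, 1] (100 tokens) yields a vector that points three times as strongly along the first axis as the second.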

enhanced (Coming soon)

AI-augmented pipeline. A fast language model preprocesses text to extract, expand, and enrich searchable content before embedding. Query-side expansion improves recall for short or ambiguous queries.

Dimensions: 1024

Latency: TBD

Accuracy gain: TBD

Document side: Keyword extraction, concept expansion, synonym injection

Query side: Short query expansion into richer semantic representations

Compatibility: Same 1024-dim output. Drop-in replacement for base

Shielded mode

shielded: true (Active)

Property-preserving encryption applied to any model's output. Based on the SAP algorithm (Fuchsbauer et al. 2022). Approximate nearest-neighbour search is preserved while stored embeddings are protected at rest against database leaks and cross-team access.

Pipeline

1. Normalise: unit length

2. Scale: per-team secret factor

3. Perturb: bounded ball noise

4. Transform: sign flips + dimension permutation
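To make the four pipeline steps concrete, here is a toy sketch in plain Python. It is neither the published SAP algorithm nor PolyEmbed's implementation: the function names, the team_seed integer, the default scale and beta values, and the way noise is seeded from the input are all illustrative assumptions.

```python
import math
import random
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def shield(vec: Sequence[float], team_seed: int,
           scale: float = 2.0, beta: float = 0.05) -> List[float]:
    """Toy normalise -> scale -> perturb -> transform (all parameters hypothetical)."""
    dim = len(vec)
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot shield a zero vector")

    # 1. Normalise to unit length.
    unit = [x / norm for x in vec]
    # 2. Scale (a per-team secret in the real system; a plain argument here).
    scaled = [scale * x for x in unit]
    # 3. Perturb with noise drawn from a ball of radius beta. Seeding from the
    #    team seed and the input keeps the output deterministic.
    noise_rng = random.Random(repr((team_seed, unit)))
    direction = [noise_rng.gauss(0.0, 1.0) for _ in range(dim)]
    d_norm = math.sqrt(sum(d * d for d in direction))
    radius = beta * noise_rng.random() ** (1.0 / dim)
    perturbed = [s + radius * d / d_norm for s, d in zip(scaled, direction)]
    # 4. Transform with sign flips and a permutation derived from the team seed.
    key_rng = random.Random(team_seed)
    signs = [key_rng.choice((-1.0, 1.0)) for _ in range(dim)]
    perm = list(range(dim))
    key_rng.shuffle(perm)
    return [signs[j] * perturbed[j] for j in perm]
```

Because every vector for one team goes through the same flips and permutation, cosine similarity between two shielded vectors stays close to the unshielded value, shifted only by the bounded noise. Vectors shielded with different team seeds are permuted differently, so comparing them is meaningless.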

Performance

Metric | Unshielded | Shielded
Added latency | - | <1ms
Output dimensions | 1024
Near-identical cosine | ~0.99 | ~0.94
Distant cosine | ~0.05
Ranking preserved | Yes
Reversible | Yes | No*

Constraints

Shielded embeddings from different teams cannot be compared — each team has unique encryption parameters derived from an independent cryptographic salt.

Rotating the server secret key invalidates all existing shielded embeddings. Documents must be re-embedded.

Deterministic: the same text and team always produce the same shielded vector. This enables deduplication but means low-entropy inputs (e.g. structured codes, short phrases from a known list) may be vulnerable to dictionary matching by same-team key holders.

*Reversibility: shielded vectors cannot be reversed to recover input text. An attacker who compromises both the server secret key and the per-team database salt can approximately recover the original embedding vector (not the text), so both secrets must be protected.

Ranking is preserved when the distance gap between candidates exceeds the noise parameter (beta). Very close distances may have their ordering perturbed.

Threat model

Scenario | Unshielded | Shielded
Database leak: attacker obtains stored vectors | Exposed | Protected
Vector DB compromised (Pinecone, Qdrant, etc.) | Exposed | Protected
Cross-team: team A's vectors used to decode team B | N/A | Protected
DBA / admin with database access but no API key | Exposed | Protected
External attacker with own API key, no access to target team | Exposed | Protected
Compromised API key holder within the same team | Exposed

Note

Shielded mode protects stored embeddings at rest. It does not protect against an attacker who holds a valid API key for the same team — they can embed known texts and compare against stolen shielded vectors (dictionary attack). If this scenario is in your threat model, revoke compromised keys immediately.

Defense-in-depth: reversing shielded embeddings requires both the server secret key and the per-team cryptographic salt (stored in the database). Compromising either one alone is insufficient.
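One way to picture the two-secret requirement is a keyed derivation in which a team's shielding parameters depend on both the server secret key and that team's salt. The sketch below uses HMAC-SHA256 from the Python standard library. The function name derive_team_seed and the choice of HMAC are assumptions for illustration only; the source does not say how the parameters are actually derived.

```python
import hashlib
import hmac


def derive_team_seed(server_secret: bytes, team_salt: bytes) -> int:
    """Mix the server secret with a team's salt into a 256-bit seed (hypothetical scheme)."""
    digest = hmac.new(server_secret, team_salt, hashlib.sha256).digest()
    return int.from_bytes(digest, "big")


# Same inputs always give the same seed, so shielding stays deterministic.
seed_a = derive_team_seed(b"server-secret-v1", b"salt-for-team-a")
# A different salt gives an unrelated seed, so teams cannot compare vectors.
seed_b = derive_team_seed(b"server-secret-v1", b"salt-for-team-b")
# Rotating the server secret changes every team's seed at once.
seed_a_rotated = derive_team_seed(b"server-secret-v2", b"salt-for-team-a")
```

In a scheme like this, neither secret alone is enough to recompute a seed, and rotating the server secret changes every derived seed, which is consistent with the requirement to re-embed documents after rotation.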