Embedding Inversion: Vector Stores as a Source-Text Disclosure Surface
Stored embeddings can be partially inverted to reconstruct source text — a vector index is a copy of the corpus, not metadata.
Embedding inversion uses a learned decoder to map a stored vector back to its source text, reported at 92% exact recovery on 32-token inputs from text-embedding-ada-002 (Morris et al., EMNLP 2023). It is the confidentiality slice of OWASP LLM08:2025 Vector and Embedding Weaknesses — distinct from the access slice (cross-tenant retrieval) and the integrity slice (knowledge-base poisoning).
When the Threat Is Real
The 92% headline holds only under narrow conditions. Recovery rates degrade sharply outside the inverter's training regime (Seputis et al., 2025):
| Condition | Effect on recovery |
|---|---|
| Input length matches inverter training length (e.g. 32 tokens) | High — 92% exact match, BLEU ~83 on MSMARCO (Morris et al., 2023) |
| In-domain text (Quora, MSMARCO) | High — 95.1 BLEU on Quora (Seputis et al., 2025) |
| Out-of-domain text (SciFact) | Collapses — Token F1 9.14 (Seputis et al., 2025) |
| Off-training length | Collapses — model trained on 32 tokens fails on 128 (Seputis et al., 2025) |
| High-entropy strings (passwords) | Limited — 36% Easy, 22% Medium, 4% Hard with ada-ms-128 (Seputis et al., 2025) |
The viable threat surface is short-form, in-distribution text: chat snippets, ticket titles, support messages, names, log lines. Long-form documents in unusual domains are far harder to invert.
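The exact-match and Token F1 figures in the table are straightforward to compute when evaluating an inverter against your own corpus. The sketch below assumes lowercased whitespace tokenization, which may differ from the tokenization used in the cited papers; `token_f1` and `exact_match` are illustrative names, not functions from those codebases.

```python
from collections import Counter


def token_f1(reference: str, hypothesis: str) -> float:
    """Token-level F1 between two strings (assumes lowercased whitespace tokens)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref or not hyp:
        # Two empty strings count as a perfect match; one empty string as a miss.
        return float(ref == hyp)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(ref) & Counter(hyp)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def exact_match(reference: str, hypothesis: str) -> bool:
    """True only when the reconstruction equals the source, ignoring outer whitespace."""
    return reference.strip() == hypothesis.strip()
```

These functions return values in the range 0 to 1; multiply by 100 to compare against figures reported on a 0 to 100 scale.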
Why Access Control Alone Is Not Enough
The access slice (multitenant-rag-authorization-gap) gates retrieval responses. The poisoning slice (rag-architecture-poisoning-robustness) gates retrieval inputs. Neither addresses the case where the index itself leaks — via a misconfigured vector database, an exposed admin API, an insider, or an intercepted replication channel.
ALGEN shows that an attacker holding a leaked index plus ~30 paired (text, embedding) samples exceeds ROUGE-L 20 on reconstruction, and ~1,000 samples reach ROUGE-L ~45, with no access to the original embedding model or query API (Yin et al., 2025). Black-box post-exfiltration recovery is therefore the realistic threat model. Encryption at rest does not close it: embeddings are decrypted at query time, and inversion runs on the live, decrypted vector (OWASP LLM08:2025).
Why It Works
Dense text embeddings (≥768 dimensions) preserve fine-grained semantic and lexical signal — the property retrieval relies on. The signal that makes a vector a good retrieval key is the same signal a learned decoder uses to map it back. Morris et al. frame inversion as controlled generation: generating text that, when re-embedded, is close to a fixed point in latent space — an iterative decode-then-re-embed loop that converges whenever the embedding's fingerprint is unique enough (Morris et al., EMNLP 2023). An embedding is not a hash; it is a lossy but highly recoverable encoding.
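The decode-then-re-embed loop can be illustrated with a toy. The sketch below is not vec2text: it replaces the learned decoder with a brute-force search over a tiny vocabulary, and the encoder is a made-up stand-in (a normalized sum of random per-position token vectors). `VOCAB`, `embed`, and `invert` are hypothetical names introduced only for this illustration.

```python
import numpy as np

VOCAB = ["reset", "my", "password", "please", "invoice", "overdue", "login", "failed"]
MAX_LEN, DIM = 4, 256
_rng = np.random.default_rng(0)
# Toy encoder weights: one random vector per (position, token) pair.
_TABLE = _rng.normal(size=(MAX_LEN, len(VOCAB), DIM))


def embed(tokens):
    """Toy stand-in for an embedding model: normalized sum of per-position token vectors."""
    v = sum(_TABLE[pos, VOCAB.index(tok)] for pos, tok in enumerate(tokens))
    return v / np.linalg.norm(v)


def invert(target, length, max_sweeps=10):
    """Refine a guess until its re-embedding stops moving closer to the target vector."""
    guess = [VOCAB[0]] * length
    best = float(embed(guess) @ target)
    for _ in range(max_sweeps):
        improved = False
        for pos in range(length):
            for tok in VOCAB:
                candidate = guess[:pos] + [tok] + guess[pos + 1:]
                score = float(embed(candidate) @ target)  # re-embed and compare
                if score > best:  # keep a change only if it moves closer to the target
                    guess, best, improved = candidate, score, True
        if not improved:
            break
    return guess, best
```

Calling `invert(embed(["reset", "my", "password", "please"]), 4)` returns the best guess and its cosine similarity to the target. On an encoder this simple the guess usually matches the input; a real attack swaps the exhaustive search for a trained corrector model, but the feedback signal (similarity after re-embedding) is the same.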
Controls
Layer defenses against the realistic threat model — the index leaks and the attacker holds it offline:
| Control | What it does | Source |
|---|---|---|
| Treat the index as confidential as the corpus | Same ACLs, audit, retention as the source documents | OWASP LLM08:2025 |
| Gaussian noise on stored vectors | λ≈0.01 perturbation defeats vec2text while preserving retrieval — the cheapest practical defense | Seputis et al., 2025 |
| Never embed secrets | API keys, passwords, and other short high-entropy strings are partially recoverable even at high entropy (up to 36% for Easy passwords with ada-ms-128); scrub them before embedding | Seputis et al., 2025 |
| Projection / mutual-information defense | Heavier options like Eguard report ~95% token-level protection without major utility loss | Zhou et al., 2024 |
| Index access logging | An inversion attack starts with an index read; logging turns silent confidentiality failure into a detectable event | OWASP LLM08:2025 |
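The "never embed secrets" control can be approximated with a scrubbing pass that runs before chunks reach the embedding model. The sketch below is a minimal illustration under stated assumptions: the regular expression, the 16-character minimum, and the 3.5 bits-per-character entropy threshold are all guesses chosen for the example, and `scrub` is a hypothetical helper, not part of any library. A production pipeline would more likely rely on a maintained secret scanner.

```python
import math
import re
from collections import Counter

# Illustrative pattern: values that follow "password:", "api_key=", and similar labels.
ASSIGNMENT = re.compile(
    r"\b(password|passwd|secret|api[_-]?key|token)\b(\s*[:=]\s*)(\S+)",
    re.IGNORECASE,
)


def shannon_entropy(s: str) -> float:
    """Bits per character of the string's empirical character distribution."""
    n = len(s)
    return -sum(c / n * math.log2(c / n) for c in Counter(s).values())


def scrub(text: str, min_len: int = 16, min_entropy: float = 3.5) -> str:
    """Redact labelled secrets and long high-entropy tokens (thresholds are assumptions)."""
    # Pass 1: keep the label and separator, replace the value.
    text = ASSIGNMENT.sub(lambda m: m.group(1) + m.group(2) + "[REDACTED]", text)
    # Pass 2: redact long tokens whose characters look random. Note that
    # splitting and re-joining collapses runs of whitespace.
    cleaned = []
    for tok in text.split():
        if len(tok) >= min_len and shannon_entropy(tok) >= min_entropy:
            cleaned.append("[REDACTED]")
        else:
            cleaned.append(tok)
    return " ".join(cleaned)
```

The entropy pass will also redact some benign strings such as long URLs or hashes, so the thresholds need checking against a sample of the real corpus before relying on them.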
Example
The Gaussian-noise defense is a small addition to the indexing pipeline. The noise is applied at write time, so the stored vectors are already perturbed when an attacker reads them.
Before — raw embeddings stored:
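from qdrant_client.models import PointStruct
# `embed`, `chunks`, and the `qdrant` client are assumed to be defined elsewhere.
points = [PointStruct(id=i, vector=embed(chunk)) for i, chunk in enumerate(chunks)]
qdrant.upsert(collection_name="docs", points=points)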
After — λ=0.01 Gaussian noise applied before storage:
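import numpy as np
from qdrant_client.models import PointStruct

LAMBDA = 0.01  # noise standard deviation; tune against your retrieval-quality benchmarks

def noisy(v):
    # Add zero-mean Gaussian noise with standard deviation LAMBDA to every dimension.
    return np.asarray(v) + np.random.normal(0.0, LAMBDA, size=np.shape(v))

# `embed`, `chunks`, and the `qdrant` client are assumed to be defined elsewhere.
points = [PointStruct(id=i, vector=noisy(embed(chunk)).tolist()) for i, chunk in enumerate(chunks)]
qdrant.upsert(collection_name="docs", points=points)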
The noise level must be calibrated against retrieval Precision@K on the deployment workload — λ too small leaves inversion viable, λ too large degrades search (Seputis et al., 2025).
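A calibration sweep for the retrieval side of that trade-off might look like the sketch below. It runs on synthetic unit vectors and treats the clean index's top-k results as ground truth, which is an assumption made so the example is self-contained; a real calibration would use labeled relevance judgments from the deployment workload and a separate measurement of inversion success. `top_k` and `precision_at_k` are illustrative helpers.

```python
import numpy as np


def top_k(index, queries, k):
    """Indices of the k highest cosine-similarity rows of `index` for each query."""
    index_n = index / np.linalg.norm(index, axis=1, keepdims=True)
    queries_n = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    scores = queries_n @ index_n.T
    return np.argsort(-scores, axis=1)[:, :k]


def precision_at_k(clean_index, noisy_index, queries, k=5):
    """Share of the clean top-k that the noisy index still returns, averaged over queries."""
    clean = top_k(clean_index, queries, k)
    noisy = top_k(noisy_index, queries, k)
    hits = [len(set(c) & set(n)) / k for c, n in zip(clean, noisy)]
    return float(np.mean(hits))


rng = np.random.default_rng(0)
docs = rng.normal(size=(500, 64))  # synthetic stand-in for stored embeddings
docs /= np.linalg.norm(docs, axis=1, keepdims=True)
queries = docs[:50] + rng.normal(0.0, 0.05, size=(50, 64))  # queries near known documents

for lam in (0.001, 0.01, 0.1, 1.0):
    noisy_docs = docs + rng.normal(0.0, lam, size=docs.shape)
    p = precision_at_k(docs, noisy_docs, queries)
    print(f"lambda={lam}: P@5 against clean index = {p:.2f}")
```

The sweep only shows where retrieval starts to degrade. Choosing λ also requires running an inversion attack against the noisy vectors to confirm that the chosen level actually reduces recovery.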
When This Backfires
- The index is not the attacker's path. Most teams hit cross-tenant retrieval (the access slice) long before they hit embedding inversion. The cross-tenant probe-leak rate is 98–100% on un-gated retrieval (Arceo and Narsing, 2026); inversion requires the attacker to first exfiltrate the index. Close the access slice first.
- Embedding model and corpus are out-of-distribution. Long documents, technical jargon, code, and non-English text all degrade recovery rates by an order of magnitude (Seputis et al., 2025). For these corpora, inversion is a low-priority residual risk after access and poisoning are addressed.
- Tuning Gaussian noise without measurement. λ=0.01 is the published sweet spot for one specific paper's setup. Without measuring retrieval Precision@K and inversion success on your own corpus, the parameter is guesswork — the defense can silently break retrieval or silently fail to defend (Seputis et al., 2025).
- Encryption is the only defense. Encryption protects the index at rest but not at query time; inversion attacks operate on decrypted vectors. An encryption-only deployment fails an LLM08 confidentiality audit and provides a false sense of safety (OWASP LLM08:2025).
Key Takeaways
- Vector indexes are derivative copies of the source corpus, not metadata. Protecting them is the LLM08 confidentiality slice, which access control and poisoning defenses do not address.
- The strongest published recovery (~92% on 32-token text) holds only under narrow conditions: out-of-domain inputs collapse to single-digit Token F1 (9.14 on SciFact), and an inverter trained on one input length fails on another.
- Black-box attacks (ALGEN) need only ~30 paired samples to bootstrap, so a leaked index plus any public-corpus overlap is sufficient — the attacker does not need the original embedding model.
- The cheap defense is λ≈0.01 Gaussian noise on stored vectors; tune against your own retrieval benchmarks rather than copying the published constant.
- For most teams the priority order is: close access (cross-tenant retrieval), then poisoning (KB write controls), then confidentiality (this page). Skipping the first two to harden the third is a misallocation.