Deep Dive: How Weaviate Really Works Under the Hood

From filtered vector search to storage engines — a complete technical walkthrough from ingestion to retrieval


Vector databases are often treated as black boxes. You push embeddings in, fire a query, and results come back. For most use cases, that abstraction holds. But the moment you start tuning for performance, debugging unexpected recall drops, designing high-throughput ingestion pipelines, or reasoning about what happens during a crash — the black box stops being enough.

This article peels back every layer of Weaviate’s internals. We start from how filtered vector search actually works, move through how the HNSW graph is constructed and how nodes decide their connections, and go all the way down to the storage engines, the inverted index data structures, and the crash recovery mechanism sitting beneath all of it. Each section builds on the last, so by the end you’ll have a complete mental model of the entire system — from a single write call to the bytes on disk.


Filtered Vector Search — How It Actually Works

The most common misconception about Weaviate is how it handles filtered vector search. The intuitive mental model — traverse the vector index to find nearest neighbors, then filter the results — is actually the wrong model, and understanding why it fails is the starting point for understanding what Weaviate actually does.

Why Post-Filtering Breaks

HNSW (Hierarchical Navigable Small World) is a graph-based approximate nearest neighbor index. It knows nothing about your metadata filters. It only understands vector distances. So if you run HNSW first and filter afterward, you face a fundamental problem: the number of results HNSW returns is bounded by the ef parameter (typically 64–128). If your filter only matches 0.5% of your dataset, there’s no guarantee that any of those matching objects will appear in the top-64 nearest neighbors. You could traverse thousands of nodes and return zero useful results.

This isn’t a Weaviate-specific problem — it’s a fundamental tension between approximate nearest neighbor search and arbitrary metadata filtering.

Pre-Filtering with an Allowlist

Weaviate’s primary strategy inverts the order entirely. Before touching the HNSW graph, Weaviate resolves the filter by querying its inverted index — a separate index that maps scalar property values to the set of object IDs that match. This produces an allowlist: a set of IDs that are valid candidates for the final result.

Step 1: Query the inverted index
        filter: category == "electronics"
        → allowlist = {id3, id7, id19, id45, id102, ...}

Step 2: HNSW traversal with allowlist masking
        during graph traversal, when the algorithm wants
        to visit a node → check if its ID is in the allowlist
        if NOT in allowlist → skip and continue traversal

Step 3: Return top-K from the allowed nodes only

The graph topology doesn’t change. HNSW still navigates the same edges it always does. The allowlist acts as a transparent mask — nodes outside it are simply invisible to the traversal. The inverted index lookup itself is a fast key-value operation, completing in microseconds before HNSW traversal begins.

The Flat Search Fallback

Allowlist masking works well when the filter is loose — when many objects pass the filter, the masked graph is dense enough for HNSW to navigate efficiently. But when the filter is very selective, the masked graph becomes so sparse that HNSW’s navigational structure breaks down. The graph’s edges were built to reflect global vector proximity, not the topology of your filtered subset. With only 50 matching objects out of 1 million, HNSW might wander for hundreds of hops before landing on an allowed node, and its termination conditions might fire prematurely.

Weaviate handles this with an automatic flat search fallback. When the allowlist is small enough — below a configurable threshold — Weaviate skips HNSW entirely and performs a brute-force scan directly over the allowlist. With only 50 candidates, brute-force distance computation is trivially fast.

vectorIndexConfig:
  flatSearchCutoff: 40000   # default

If allowlist size < flatSearchCutoff → brute-force flat search over allowlist. If allowlist size >= flatSearchCutoff → HNSW traversal with allowlist masking.

Scenario Strategy Why
No filter Pure HNSW traversal Maximum speed
Loose filter (large allowlist) HNSW + allowlist masking Good recall, efficient traversal
Tight filter (small allowlist) Flat brute-force on allowlist Reliable when graph is too sparse

Weaviate chooses between these strategies automatically at query time. No configuration is required beyond optionally tuning flatSearchCutoff.


The HNSW Graph — Structure, Construction, and Navigation

HNSW is a layered graph structure. Every object in Weaviate’s vector index is a node in this graph, and the edges between nodes encode vector proximity — two nodes are connected if their embedding vectors are close in the embedding space.

The “hierarchical” part means the graph is organized into multiple layers, where upper layers are sparse and contain only a small fraction of all nodes, and the bottom layer (Layer 0) contains every node. The “navigable small world” part means the graph is constructed so that you can reach any node from any other node in a small number of hops — a property that enables fast approximate nearest neighbor search.

The Layered Structure

Layer 2 (very few nodes):    A ─────────────── F
                              │
Layer 1 (some nodes):        A ───── C ───── F ───── H
                              │               │
Layer 0 (ALL nodes):         A─B─C─D─E─F─G─H─I─J─K─L

Upper layers have fewer nodes and longer-range edges. They exist purely for navigation — they let you cover large swaths of vector space in a single hop, getting you close to the target region quickly. Layer 0 has every node, with dense, short-range edges that enable precise local search once you’ve landed in the right neighborhood.

A search query enters the graph at the topmost layer via a single global entry point, greedily descends through each layer finding closer and closer nodes to the query vector, and finally performs a wider beam search at Layer 0 to find the true approximate top-K.

How Layer Assignment Works

Every node is always inserted into Layer 0. But it may also be inserted into Layer 1, Layer 2, and higher — determined by a random draw at insertion time:

layer = floor( −ln(uniform(0,1)) × mL )

where  mL = 1 / ln(M)
and    M  = max connections per node (default 16)

For M=16, mL ≈ 0.36. The resulting distribution across a dataset of 1 million vectors:

Layer Expected Node Count
Layer 0 1,000,000 nodes — every object
Layer 1 ~62,500 nodes (6.25%)
Layer 2 ~3,906 nodes (0.39%)
Layer 3 ~244 nodes
Layer 4 ~15 nodes
Layer 5 ~1 node — the global entry point

The exponential thinning is the mathematical foundation of the O(log N) search complexity. Each layer reduces the navigable node population by a factor of M. A search that covers 1 million nodes at Layer 0 only needs to navigate ~15 nodes at Layer 4 before descending. The structure is essentially a probabilistic skip list generalized to high-dimensional vector space.

Inserting a New Node — Step by Step

When a new node X is inserted with an assigned max layer of 2 (for example):

Phase 1 — Navigation (layers above X’s max layer): Starting from the global entry point at the top layer, the algorithm greedily hops to whichever neighbor is closest to X at each layer, descending until it reaches Layer 2 (X’s max layer). These upper layers are used only for navigation — X is not inserted into them.

Phase 2 — Insertion (from X’s max layer down to Layer 0): At each layer from 2 down to 0, the algorithm performs a beam search with width ef_construction (default 128) to find the best candidate neighbors for X. From these candidates, it selects M connections using the neighbor selection heuristic (covered in detail below), then creates bidirectional edges between X and its selected neighbors.

Entry point update: If X’s assigned max layer is higher than the current graph’s maximum layer, X becomes the new global entry point.

Insertion of node X (assigned to max layer 2):

Top layer → navigate greedily to find landing point at Layer 2
Layer 2   → beam search ef_construction candidates → select M → connect
Layer 1   → beam search ef_construction candidates → select M → connect
Layer 0   → beam search ef_construction candidates → select M → connect

This is an online algorithm — HNSW accepts incremental inserts naturally. There is no batch rebuild step. New nodes inserted into an already-populated graph follow the same procedure and integrate seamlessly.

How Nodes Choose Their Connections — The Diversity Heuristic

The most subtle and important aspect of HNSW construction is how a node selects its M neighbors from the ef_construction candidates the beam search returns.

The naive approach — simply pick the M closest candidates — produces a badly navigable graph. If X’s true nearest neighbors are all clustered in the same region of vector space, all M connections point in the same direction. X becomes unreachable from the opposite side of the space, and search quality degrades significantly.

Naive closest-M selection:

          C1  C2  C3
            ↘  ↓  ↙
               X          ← all neighbors in the same direction
             C4  C5        X is a dead end from the right side of space
             C6  C7

Weaviate uses the SELECT-NEIGHBORS-HEURISTIC, which enforces diversity across the selected connections:

def select_neighbors_heuristic(X, candidates, M):
    result = []
    # candidates sorted by distance to X, closest first

    for candidate C in candidates:
        if len(result) >= M:
            break

        good = True
        for already_selected R in result:
            if distance(C, R) < distance(C, X):
                # C is closer to some already-selected neighbor R
                # than it is to X itself.
                # This means C covers the same angular region as R.
                # C is geometrically redundant — skip it.
                good = False
                break

        if good:
            result.append(C)

    return result

The key check: for each candidate C, if C is closer to any already-selected neighbor R than it is to X, then C is geometrically redundant — R already “covers” that direction. Skip C and move to the next candidate.

Heuristic selection:

    C1                    C9
      ↘                  ↙
           X                  ← neighbors distributed across all directions
      ↗                  ↖
    C3                    C11

Each selected neighbor covers a distinct angular region around X. This ensures that no matter which direction a search is approaching X from, there’s a connection pointing in the right direction. The graph becomes truly navigable in all directions.

Bidirectional linking and the shrink operation: After X selects its neighbors, each of those neighbors must also link back to X. If a neighbor N already has M connections (its list is full), it runs the same heuristic on its existing M connections plus X, and keeps the best M — potentially evicting one old connection. This is called the shrink operation, and it keeps the graph balanced as the dataset grows.

HNSW Configuration Parameters

Parameter What It Controls Default
M Max bidirectional connections per node per layer 16
maxConnections Max connections at Layer 0 specifically (usually 2×M) 32
ef_construction Beam width during insertion — wider = better graph quality, slower ingest 128
ef Beam width during query — wider = better recall, slower query 64
flatSearchCutoff Allowlist size threshold for flat search fallback 40,000

Higher ef_construction gives a better-connected graph and better search recall, at the cost of slower ingestion. Higher ef at query time improves recall at the cost of latency. These are the two primary knobs for the quality-performance trade-off.


The Inverted Index — Scalar Property Filtering at Scale

Weaviate maintains a separate inverted index for every scalar property on every class. This is what makes fast pre-filtering possible — when you apply a filter, Weaviate resolves it against the inverted index first, before touching the HNSW graph at all.

What an Inverted Index Is

The fundamental idea is to invert the storage model. A normal object store maps from object ID to its properties. An inverted index maps from property values to the set of object IDs that have that value:

Object Store (normal direction):
  id=1 → { category: "electronics", price: 299, in_stock: true }
  id=2 → { category: "clothing",    price: 49,  in_stock: true }
  id=3 → { category: "electronics", price: 599, in_stock: false }
  id=4 → { category: "clothing",    price: 129, in_stock: true }

Inverted Index (flipped direction):
  category:"electronics" → [1, 3]
  category:"clothing"    → [2, 4]
  price:299              → [1]
  price:49               → [2]
  price:599              → [3]
  price:129              → [4]
  in_stock:true          → [1, 2, 4]
  in_stock:false         → [3]

A filter like category == "electronics" is resolved by a direct key lookup — no scanning, no iteration. The result [1, 3] is the allowlist handed to HNSW.

For text properties, Weaviate applies a tokenization pipeline before indexing:

Raw value: "Running Shoes for Men"
     ↓
Tokenizer (configured per property: word / field / whitespace)
     ↓
["running", "shoes", "for", "men"]
     ↓
Each token indexed separately:
  "running" → [..., id_X]
  "shoes"   → [..., id_X]
  "men"     → [..., id_X]

filterableIndex: true stores the full value as a single token for exact-match filters. searchableIndex: true tokenizes for BM25 text search. Both can be enabled simultaneously on the same property.

Index Structures by Property Type

Different property types need different underlying index structures to support their query patterns efficiently:

Property Type Index Structure Why
string / text Hash map: term → roaring bitmap O(1) exact match lookup
number / int B-tree Supports range queries: price > 100 AND price < 500
boolean Two-entry map: true → [ids], false → [ids] Only two possible values
date B-tree (stored as int64 unix timestamp) Range queries over time
geo coordinates R-tree based geo index Radius queries, bounding box queries

On Ingestion

client.data_object().create({ ... })

  1. Vectorizer module converts object to embedding vector
  2. WAL append → fsync() → durability guaranteed
  3. Three parallel updates:
     │
     ├── RocksDB object store
     │     → WAL → MemTable → eventual SSTable flush
     │
     ├── RocksDB inverted index
     │     → for each scalar property:
     │       tokenize value → update roaring bitmap posting list
     │       WAL → MemTable → eventual SSTable flush
     │
     └── BoltDB HNSW graph
           → assign max layer via probability formula
           → navigate to insertion neighborhood
           → beam search ef_construction candidates
           → SELECT-NEIGHBORS-HEURISTIC → pick best M
           → bidirectional connect (shrink neighbors if full)
           → COW B+ tree write → atomic root pointer update

  4. Return 201 Created to client
client.query.get(...).with_near_vector(...).with_where(filter).do()

  1. Resolve filter → RocksDB inverted index
       → key lookup per filter term
       → roaring bitmap operations (AND / OR / NOT) for compound filters
       → result: allowlist of matching object IDs

  2. Choose search strategy:
       if len(allowlist) < flatSearchCutoff:
           → brute-force distance computation over allowlist
       else:
           → HNSW traversal with allowlist masking
               enter at global entry point (top layer)
               greedy descent through upper layers
               beam search at Layer 0 with ef candidates
               skip any node not in allowlist

  3. Return top-K results ranked by vector distance

On Crash Recovery

Process restarts after crash:

  RocksDB:
    → Read WAL file
    → Replay all records since last MemTable flush checkpoint
    → Reconstruct MemTable to pre-crash state
    → Verify SSTable checksums

  BoltDB:
    → Memory-mapped file is intact (COW never overwrote committed pages)
    → Read root page pointer → valid committed B+ tree state
    → Any in-progress write transaction that didn't commit is simply absent
    → HNSW graph is back in its last fully committed state

  Result: zero data loss for any write that received a success response

Each component’s guarantees compose cleanly. The WAL ensures RocksDB never loses an acknowledged write. BoltDB’s copy-on-write atomicity ensures the HNSW graph is always in a consistent committed state. SSTable immutability ensures compacted data is never corrupted by a partial write. Together, they give Weaviate end-to-end durability without any of the components needing to know about each other.


Why This Architecture Is the Way It Is

The design of Weaviate’s storage layer is not arbitrary. Every choice reflects a specific access pattern.

RocksDB’s LSM tree was chosen for the object store and inverted index because both absorb the full write throughput of every ingestion operation. Bulk ingest — millions of objects per hour — demands sequential write performance. LSM trees are architected for exactly this workload. The cost is slightly more complex reads, mitigated by Bloom filters and block caches.

BoltDB’s B+ tree was chosen for HNSW graph metadata because the graph is read orders of magnitude more often than it’s written. A search query reads node connection lists; an insert writes a handful of new entries. B+ trees are optimal for this read-heavy, random-access workload. The copy-on-write model also gives atomic HNSW updates for free.

Roaring Bitmaps were chosen for inverted index posting lists because filter resolution must be fast even for compound multi-property filters with millions of matching IDs. Plain arrays, hash sets, and fixed bitsets all fail in different ways. Roaring Bitmaps adapt their internal representation to the data density, giving compact storage and CPU-speed set operations regardless of how selective or non-selective the filter is.

The allowlist-first filtered search strategy was chosen because post-filtering is fundamentally unreliable for selective filters — a lesson learned across the entire approximate nearest neighbor search literature, not just Weaviate. By resolving the scalar filter first and masking the vector search, Weaviate guarantees that the top-K results are always drawn from the correct candidate pool.

No single data structure solves all of these problems. Weaviate’s architecture works because it uses the right data structure for each specific problem, and composes them into a pipeline where each component’s output is exactly what the next component needs.