Most teams do not need a knowledge graph. Vector search fails on exactly two kinds of question: set intersection (matching both A and B) and hierarchy or path traversal (walking the chain from here to there). A graph fixes those, and only those. If your queries are not relational, GraphRAG adds extraction cost and maintenance for a capability you will rarely use. Build one only when query patterns are genuinely multi-hop, and even then as a complement to vector search.
This piece targets intermediate React, Vue, and TypeScript developers running a vector RAG that already works for document-lookup questions and falls over on relational ones. As of June 2026, Microsoft's open-source GraphRAG is well past its 1.0 milestone (the repository sits on the 3.x line), and a wave of lower-cost variants such as LazyGraphRAG, LightRAG, and Fast GraphRAG has changed the build-versus-skip math. The code here is illustrative TypeScript, kept lighter than usual, because the decision matters more than any one library's API.
TL;DR: do you need a knowledge graph or not?
Stay on vector search with reranking if your queries are overwhelmingly "what does this say about X." That covers most support bots, documentation assistants, and search-over-notes products, and a graph buys you nothing there. Build a knowledge graph only when a real share of your queries are relational: set intersection, hierarchy traversal, or multi-hop path questions that vector similarity structurally cannot answer. When you do build one, run it as hybrid retrieval (route relational queries to the graph, everything else to vectors, and boost chunks linked to graph-retrieved entities) rather than ripping out the vector index. And before you commit to a full Microsoft GraphRAG build, evaluate the cheaper 2026 variants, because the indexing bill is the part teams underestimate. The honest default is to skip it.
Why vector search returns confident wrong answers on relational questions
The failure that sends people looking at graphs is not an error message. It is a fluent, plausible answer that happens to be wrong, and that is worse, because nobody notices until a user does.
A vector RAG embeds each chunk of your corpus into a high-dimensional vector, embeds the query the same way, and returns the chunks whose vectors sit closest to the query's. That is similarity, and similarity is genuinely good at "find me passages about late-stage clinical trials" or "summarize what this contract says about termination." The retriever finds text that resembles the question and the model reads it back to you.
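The ranking step described above can be sketched in a few lines. This is a minimal illustration, assuming embeddings are plain number arrays and using a brute-force scan; the `EmbeddedChunk` type and function names are hypothetical, and a production index would normally use an approximate nearest-neighbour structure instead.

```typescript
// Hypothetical shape: a chunk id plus the vector an embedding model produced for it.
interface EmbeddedChunk {
  id: string
  embedding: number[]
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Embeddings must have the same dimension')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Rank every chunk by similarity to the query embedding and keep the top k.
function topKBySimilarity(
  queryEmbedding: number[],
  chunks: EmbeddedChunk[],
  k: number,
): EmbeddedChunk[] {
  return chunks
    .map((chunk) => ({ chunk, similarity: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, k)
    .map((entry) => entry.chunk)
}
```

Note what the function never looks at: entities, conditions, or relationships. It only orders chunks by closeness to one query vector.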
Now ask a relational question against the same corpus: "which of our vendors are certified for both SOC 2 and HIPAA?" The embedding for that query is near chunks that mention vendors, SOC 2, and HIPAA. So the retriever happily returns a chunk about a vendor with SOC 2, and another chunk about a different vendor with HIPAA, and the model stitches them into an answer that reads correct and satisfies neither condition. Nothing in the pipeline ever computed the intersection, because similarity is not intersection. The retriever cannot answer a question about the relationship between entities, only about resemblance to text.
That is the gap. It is narrow, but inside it live some of the highest-value questions a business asks.
The questions vector search structurally cannot answer
It helps to be precise about which questions break, because the list is short and a graph is only justified by what is on it. Two patterns matter.
Set intersection ("both A and B")
Any query that asks for the entity satisfying multiple independent conditions is set intersection, and a similarity ranker cannot do it. "Suppliers who ship to the EU and offer net-60 terms." "Engineers who know Rust and have shipped on the payments team." "Papers that cite both Smith and Jones." Each condition lives in different chunks, often in different documents, and the answer is the entity where the conditions overlap.
Vector search ranks each chunk by how close it sits to the blended query embedding. A chunk that strongly satisfies one condition can outrank a chunk that weakly satisfies both, so the true intersection often does not even make the top results. You can paper over a single hard case by stuffing more chunks into the context window and asking the model to filter, but that scales badly: the cost climbs, the recall stays unreliable, and you are paying a large model to do a join that a graph does in one traversal.
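For contrast, here is a minimal sketch of the same question answered over edges. It assumes a hypothetical list of `certified-for` edges (the `CertEdge` type and the vendor names are invented for illustration) and simply keeps the sources that have an edge to every required target.

```typescript
// Hypothetical typed edge: "vendor --certified-for--> standard".
interface CertEdge {
  from: string
  type: 'certified-for'
  to: string
}

// Illustrative data only; not taken from any real corpus.
const certEdges: CertEdge[] = [
  { from: 'Acme', type: 'certified-for', to: 'SOC 2' },
  { from: 'Globex', type: 'certified-for', to: 'HIPAA' },
  { from: 'Initech', type: 'certified-for', to: 'SOC 2' },
  { from: 'Initech', type: 'certified-for', to: 'HIPAA' },
]

// Return every source entity that has an edge to ALL of the required targets.
function entitiesSatisfyingAll(edges: CertEdge[], required: string[]): string[] {
  const targetsBySource = new Map<string, Set<string>>()
  for (const edge of edges) {
    const targets = targetsBySource.get(edge.from) ?? new Set<string>()
    targets.add(edge.to)
    targetsBySource.set(edge.from, targets)
  }
  const matches: string[] = []
  targetsBySource.forEach((targets, source) => {
    if (required.every((target) => targets.has(target))) matches.push(source)
  })
  return matches
}

const bothCertified = entitiesSatisfyingAll(certEdges, ['SOC 2', 'HIPAA'])
```

With this sample data only Initech survives, because the intersection is computed explicitly rather than inferred from how similar a passage looks to the query.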
Hierarchy and path traversal
The second pattern is anything that requires following a chain of relationships. "Who is two levels up from this engineer in the org chart?" "What depends on this microservice, and what do those depend on?" "Trace the ownership from this subsidiary to the ultimate parent company." These are path queries, and the answer is defined by the edges between entities, not by any single passage.
Vector search has no notion of an edge. It can retrieve the chunk that says "Service A calls Service B" and the chunk that says "Service B calls Service C," but it cannot compose them into "A transitively depends on C" unless that exact sentence happens to exist in your text. Multi-hop reasoning over similarity is a series of lucky retrievals, and luck is not a retrieval strategy. A graph walks the path directly: start at a node, follow the edges, return what you land on.
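A rough sketch of that walk, assuming a hypothetical adjacency list of direct service calls (the service names and the `DependencyGraph` type are made up for this example). A breadth-first traversal composes the hops that no single passage states.

```typescript
// Hypothetical adjacency list: service -> services it calls directly.
type DependencyGraph = Record<string, string[]>

const serviceCalls: DependencyGraph = {
  'service-a': ['service-b'],
  'service-b': ['service-c'],
  'service-c': [],
}

// Breadth-first walk that returns everything reachable from `start`,
// mapped to the number of hops it took to get there.
function transitiveDependencies(graph: DependencyGraph, start: string): Map<string, number> {
  const hops = new Map<string, number>()
  const seen = new Set<string>([start])
  const queue: Array<{ id: string; depth: number }> = [{ id: start, depth: 0 }]
  while (queue.length > 0) {
    const current = queue.shift()
    if (current === undefined) break
    for (const next of graph[current.id] ?? []) {
      if (seen.has(next)) continue // guards against cycles
      seen.add(next)
      hops.set(next, current.depth + 1)
      queue.push({ id: next, depth: current.depth + 1 })
    }
  }
  return hops
}

const reachable = transitiveDependencies(serviceCalls, 'service-a')
```

Here `reachable` contains service-c at two hops even though no edge, and no sentence, connects A to C directly.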
If neither of these patterns describes your actual query log, the rest of this article is permission to stop reading and keep your vector index. The graph fixes set intersection and traversal. If you are not doing those, it fixes nothing.
What a knowledge graph actually adds: nodes, edges, traversal
A knowledge graph is three things: nodes (the entities, like a vendor, a person, a service), edges (the typed relationships between them, like certified-for, reports-to, depends-on), and the ability to traverse those edges with a query. That last part is the whole point. The structure is not the value; the traversal is.
GraphRAG, in the Microsoft sense, builds that graph for you with an LLM. It splits your text into units, extracts entities, relationships, and key claims from each unit, then clusters the resulting graph into communities using the Leiden algorithm and writes a summary of each community from the bottom up. According to the GraphRAG documentation, that hierarchical community structure is what lets it answer holistic, whole-corpus questions like "what are the main themes across all of these documents," which a chunk-level retriever struggles with because no single chunk contains the theme.
So a graph gives you two capabilities a vector index does not have. It answers relational and multi-hop queries by traversal, and it answers global sense-making queries through community summaries. Both are real. Neither is free.
The thing to hold onto is that a graph is not a better vector index. It is a different data structure that answers a different question. Swapping one for the other is a category change, not an upgrade, which is why "we'll just add GraphRAG" is the wrong mental model and "we'll add a graph for the relational queries and keep vectors for the rest" is the right one.
The real cost: extraction, schema, entity resolution, and maintenance
Here is the part the architecture diagrams leave out, and the part that decides the build-versus-skip call. The cost of a knowledge graph is not the database. It is everything you do before a single query runs.
The first cost is extraction. Turning unstructured text into entities and edges means running an LLM over your entire corpus, often multiple passes per chunk. Microsoft's own repository carries a blunt warning that GraphRAG indexing "can be an expensive operation," and it tells you to read the docs and "start small" before you point it at real data (microsoft/graphrag README). That same README also states the project "is a demonstration and is not an officially supported Microsoft offering," which is worth weighing before you put it on the critical path of a paying product. Re-indexing as your corpus grows means paying that extraction bill again.
The second cost is schema. A graph needs to know what an entity is and what relationships are legal between entities. Left fully open, extraction invents a sprawl of near-duplicate relationship types ("works-with," "collaborates-with," "partners-with") that make traversal unreliable. Constraining it means designing an ontology, which is real upfront modeling work and the kind of thing that quietly turns a two-week spike into a quarter.
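One lightweight way to constrain extraction is to normalize every extracted label onto a closed set of relationship types. The sketch below assumes a hypothetical four-type ontology and a hand-written synonym table; both are illustrative choices, not something any GraphRAG library prescribes.

```typescript
// Hypothetical closed set of relationship types for this sketch.
const RELATION_TYPES = ['works-with', 'reports-to', 'depends-on', 'certified-for'] as const
type RelationType = (typeof RELATION_TYPES)[number]

// Assumed synonym table mapping labels an extractor might invent onto the allowed set.
const RELATION_SYNONYMS = new Map<string, RelationType>([
  ['collaborates-with', 'works-with'],
  ['partners-with', 'works-with'],
  ['managed-by', 'reports-to'],
  ['relies-on', 'depends-on'],
])

// Map a raw extracted label to an allowed type, or null if it falls outside the ontology.
function normalizeRelation(rawLabel: string): RelationType | null {
  const label = rawLabel.trim().toLowerCase().replace(/[\s_]+/g, '-')
  const direct = RELATION_TYPES.find((type) => type === label)
  if (direct !== undefined) return direct
  return RELATION_SYNONYMS.get(label) ?? null
}
```

Labels that come back as `null` are the interesting ones: each is either noise to drop or a sign the ontology is missing a type, and deciding which is the modeling work the paragraph above describes.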
The third cost, and the one that bites in production, is entity resolution. The same real-world thing shows up in your text as "Acme Corp," "Acme Corporation," "ACME," and "Acme Inc.," and unless you collapse those into one node, your graph has four disconnected vendors and your set-intersection query returns nothing. Deduplication across spelling, abbreviation, and context is a hard problem on its own, and it never finishes, because every new document can introduce a new alias for an entity you already have.
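To make the problem concrete, here is a deliberately naive sketch that collapses the four Acme spellings by string normalization alone. The suffix list and function names are assumptions for illustration; this catches only the easy cases and does nothing for abbreviations, misspellings, or names that need context to disambiguate.

```typescript
// Legal suffixes to strip; an assumed, deliberately short list.
const LEGAL_SUFFIXES = /\b(incorporated|corporation|company|corp|inc|llc|ltd|co)\b\.?/g

// Reduce a surface form to a comparison key: lowercase, drop suffixes and punctuation.
function canonicalKey(surfaceForm: string): string {
  return surfaceForm
    .toLowerCase()
    .replace(LEGAL_SUFFIXES, ' ')
    .replace(/[^a-z0-9]+/g, ' ')
    .trim()
}

// Group surface forms that share a key, so each group can become one node.
function groupAliases(surfaceForms: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>()
  for (const form of surfaceForms) {
    const key = canonicalKey(form)
    const members = groups.get(key) ?? []
    members.push(form)
    groups.set(key, members)
  }
  return groups
}

const vendorGroups = groupAliases(['Acme Corp', 'Acme Corporation', 'ACME', 'Acme Inc.'])
```

All four forms land under the single key `acme` here. Real corpora are less kind, which is why resolution usually ends up combining rules like these with fuzzy matching and a review queue.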
The fourth cost is maintenance. A vector index handles a new document by embedding it and adding it. A graph has to extract the new entities, resolve them against what already exists, add the new edges, and, for Microsoft GraphRAG specifically, potentially recompute community summaries. That update path is exactly where the cheaper variants compete, and we will get to them.
None of this is a reason a graph is bad. It is a reason a graph is expensive, and that is an expense you only recover if the relational queries are frequent and valuable. The same hardening instinct from hardening an AI-generated React app for production applies here: the demo is the easy 20 percent, and the extraction, resolution, and maintenance are the 80 percent that decides whether it survives contact with real data.
Hybrid retrieval: combining graph traversal with vector search
If you have concluded the relational queries are worth it, the next mistake to avoid is throwing away vector search. The two structures answer different questions, so the strong design uses both: vectors for passage-level recall, the graph for structure. This is hybrid retrieval, and for almost every team that adopts a graph, it is the right shape.
The architecture has two moving parts. First, decide which retrieval path a given query needs. Second, when the graph is involved, use the entities it returns to boost the vector results that are tied to those entities, so structure and similarity reinforce each other rather than competing.
Classifying which queries get graph retrieval
Routing comes first, because most queries should never touch the graph. A cheap, fast classifier looks at the query and decides whether it has a relational component at all. You can start with a small model or even heuristics and only reach for something heavier if the routing accuracy is not good enough.
src/ai/route-query.ts
export type RetrievalRoute = 'vector' | 'graph' | 'hybrid'

export interface QueryClassification {
  route: RetrievalRoute
  reason: string
}

// Conditions in the same query ("both A and B") and relationship words
// ("reports to", "depends on", "connected to") signal a relational query.
const RELATIONAL_HINTS = [
  /\bboth\b.*\band\b/i,
  /\breport(s|ing)?\s+to\b/i,
  /\bdepend(s|ent)?\s+on\b/i,
  /\bconnected\s+to\b/i,
  /\bowns?\b|\bowned\s+by\b/i,
  /\bwho\s+(manages|leads)\b/i,
]

export function classifyQuery(query: string): QueryClassification {
  const isRelational = RELATIONAL_HINTS.some((pattern) => pattern.test(query))
  if (!isRelational) {
    return { route: 'vector', reason: 'No relational or multi-hop component detected' }
  }
  // Relational queries usually still want supporting passages, so default to hybrid.
  return { route: 'hybrid', reason: 'Relational signal found; traverse the graph and back it with vectors' }
}
The point of this code is not the regex list, which is a starting heuristic you will outgrow. The point is the decision it encodes: relational signals route to the graph, and everything else goes straight to vector search and never pays the graph's query cost. In production you would back this with a tiny classification model and a confidence threshold, but the routing rule is what matters. A query without a relational component has no business traversing a graph, and a router that sends every query to the graph throws away the cost savings that justified building one. This is the same routing-by-intent idea behind the provider boundary in building an LLM fallback layer before your model vanishes: the call site states intent, and a boundary decides how to satisfy it.
Boosting chunks linked to graph-retrieved entities
When a query is hybrid, the graph and the vector index each return candidates, and you have to merge them. The merge that works is not concatenation. It is using the entities the graph traversal returned to lift the score of vector chunks associated with those same entities, so passages about the right entities rise even if their raw similarity was middling.
src/ai/hybrid-retrieve.ts
export interface Chunk {
  id: string
  text: string
  score: number // raw vector similarity, 0..1
  entityIds: string[] // entities this chunk was tagged with at index time
}

export interface HybridOptions {
  // How much to reward a chunk for mentioning a graph-retrieved entity.
  boost: number
}

export function mergeWithGraphBoost(
  vectorChunks: Chunk[],
  graphEntityIds: string[],
  options: HybridOptions,
): Chunk[] {
  const relevant = new Set(graphEntityIds)
  return vectorChunks
    .map((chunk) => {
      // Count how many graph-retrieved entities this chunk is tagged with.
      const hits = chunk.entityIds.filter((id) => relevant.has(id)).length
      const boosted = chunk.score + hits * options.boost
      return { ...chunk, score: boosted }
    })
    .sort((a, b) => b.score - a.score)
}
The boost is a single tunable number, and that is deliberate. Set it to zero and you have pure vector search, which is your safe baseline. Turn it up and chunks tied to the entities your traversal surfaced climb the ranking, so the answer about "vendors certified for both SOC 2 and HIPAA" pulls the supporting passages about those specific vendors to the top instead of generic compliance text. The reason this beats dumping both result sets into the prompt is that it produces one ranked list with a defensible order, and you can tune the boost against an evaluation set rather than hoping the model sorts it out. Start the boost low, measure, and raise it only if relational answers improve without drowning ordinary lookups.
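Tuning against an evaluation set can be as simple as a sweep. The sketch below is one possible shape, not a prescribed method: it assumes a hypothetical `EvalCase` label (the chunk a human judged best for each query), restates the additive scoring rule so it stands alone, and uses top-1 accuracy as a stand-in for whatever metric you actually track.

```typescript
interface EvalChunk {
  id: string
  score: number // raw vector similarity
  entityIds: string[]
}

interface EvalCase {
  chunks: EvalChunk[]
  graphEntityIds: string[]
  expectedTopId: string // hypothetical label: the chunk a human judged best
}

// Same additive rule as the merge step, restated so this sketch is self-contained.
function topChunkId(evalCase: EvalCase, boost: number): string | undefined {
  const relevant = new Set(evalCase.graphEntityIds)
  const ranked = evalCase.chunks
    .map((chunk) => ({
      id: chunk.id,
      score: chunk.score + chunk.entityIds.filter((id) => relevant.has(id)).length * boost,
    }))
    .sort((a, b) => b.score - a.score)
  return ranked[0]?.id
}

// Fraction of cases where the expected chunk ranks first at a given boost.
function top1Accuracy(cases: EvalCase[], boost: number): number {
  if (cases.length === 0) return 0
  const correct = cases.filter((c) => topChunkId(c, boost) === c.expectedTopId).length
  return correct / cases.length
}

// Start from the pure-vector baseline (boost 0) and keep the smallest
// candidate that strictly improves accuracy over everything tried so far.
function pickBoost(cases: EvalCase[], candidates: number[]): number {
  let best = 0
  let bestAccuracy = top1Accuracy(cases, 0)
  for (const boost of candidates.slice().sort((a, b) => a - b)) {
    const accuracy = top1Accuracy(cases, boost)
    if (accuracy > bestAccuracy) {
      best = boost
      bestAccuracy = accuracy
    }
  }
  return best
}
```

Because ties never replace the current winner, the sweep prefers the lowest boost that achieves the best score, which matches the advice to start low. Include ordinary lookup queries in the evaluation set so a boost that helps relational answers but hurts lookups shows up in the number.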
Cheaper paths in 2026: LazyGraphRAG, LightRAG, and Fast GraphRAG
Before you sign up for full Microsoft GraphRAG indexing, look hard at the variants that target the exact cost we walked through above. The whole reason these exist is that the expensive part of GraphRAG is the upfront LLM work, and each of these attacks it differently. Treat the numbers below as the authors' published benchmarks, not as something measured on your corpus.
LazyGraphRAG, from Microsoft Research, defers all LLM use to query time. At indexing it uses only noun-phrase extraction and graph statistics, skipping the entity summaries and community summaries that make standard GraphRAG slow to build. Microsoft Research reports that LazyGraphRAG's indexing cost is "identical to vector RAG and 0.1% of the costs of full GraphRAG," and claims comparable answer quality to GraphRAG global search at "more than 700 times lower query cost" for global queries. If those hold for your data, the build-versus-skip calculus shifts hard, because you are paying roughly vector-search indexing prices for graph-style answers.
LightRAG is an open-source approach that pairs a graph with a vector index and, critically, supports incremental updates without rebuilding the whole community structure. The LightRAG paper reports that retrieval can take around 100 tokens against figures in the hundreds of thousands for GraphRAG, a difference the authors put near 6,000x on their benchmark. The maintenance angle is the real draw: if your corpus changes often, an approach that adds a document without reprocessing everything is worth a lot.
Fast GraphRAG, from Circlemind, leans on PageRank-style graph exploration and positions itself as a lighter, adaptive build. Its maintainers report roughly a 6x cost saving versus standard GraphRAG that grows with data size and the number of insertions. It is the smallest conceptual leap from "I want a graph" to "I have a graph" without the full Microsoft pipeline.
The pattern across all three is the same: standard GraphRAG front-loads expensive LLM work into indexing, and the cheaper variants either defer it to query time or do less of it. For a team testing whether a graph helps at all, starting with one of these de-risks the experiment, because you find out if relational retrieval moves your metrics before you have paid for a full build.
A decision framework: when the graph is worth it
Strip away the tooling and the decision comes down to your query log and your tolerance for wrong relational answers. Pull a representative sample of real user queries and label each one: is it a document-lookup question, or does it require set intersection, hierarchy, or multi-hop traversal? That single count decides most of this.
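The count itself is trivial once the labels exist. A minimal sketch, assuming you have hand-labelled a sample with the four hypothetical categories below:

```typescript
type QueryLabel = 'lookup' | 'set-intersection' | 'hierarchy' | 'multi-hop'

interface LabelledQuery {
  text: string
  label: QueryLabel
}

interface QueryLogSummary {
  counts: Record<QueryLabel, number>
  relationalShare: number // fraction of the sample that is not a plain lookup
}

// Count queries per label and report what share is anything other than a lookup.
function summarizeQueryLog(sample: LabelledQuery[]): QueryLogSummary {
  const counts: Record<QueryLabel, number> = {
    lookup: 0,
    'set-intersection': 0,
    hierarchy: 0,
    'multi-hop': 0,
  }
  for (const query of sample) counts[query.label] += 1
  const relational = sample.length - counts.lookup
  return { counts, relationalShare: sample.length === 0 ? 0 : relational / sample.length }
}
```

The hard part is the labelling, not the arithmetic, so have someone who knows the product label the sample rather than the person who wants to build the graph.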
The comparison below is the version I would put in front of a team arguing about it.
| | Vector search + reranking | Hybrid (graph + vector) | Full Microsoft GraphRAG |
|---|---|---|---|
| Best at | "What does this say about X" lookups | Mixed: lookups plus relational queries | Relational queries plus whole-corpus sense-making |
| Set intersection ("both A and B") | Fails | Handles via traversal | Handles via traversal |
| Hierarchy / path traversal | Fails | Handles via traversal | Handles via traversal |
| Indexing cost | Lowest | Medium (graph build on top of embeddings) | Highest (entity, relationship, and community summarization) |
| Maintenance on new docs | Embed and add | Extract, resolve, add edges | Extract, resolve, recompute community summaries |
| When to use | Queries are overwhelmingly lookups | A real share of queries are relational | Relational plus broad thematic queries over a stable corpus |
Read it top to bottom and the rule falls out. If your sample is almost all lookups, the left column wins and a graph is wasted money. If a meaningful slice is relational, the middle column is the pragmatic answer, because it adds traversal without surrendering similarity. The right column earns its cost only when you have both relational queries and genuine whole-corpus sense-making needs over a corpus stable enough that you are not paying the community-summarization bill every week.
Three conditions need to hold together before a full build pays off. A meaningful share of queries must be relational, not a handful of demo questions. The corpus must have stable, well-defined entities worth modeling, because resolution on messy entities will eat your timeline. And a wrong relational answer must carry a real cost, because if nobody is harmed when the intersection is slightly off, the reliability you are buying is not worth its price. Miss any one of those and you are building infrastructure to answer questions you do not really have.
When to stay with vector search plus reranking
For most teams reading this, the recommendation is the unglamorous one: do not build the graph. Add a reranker to your existing vector pipeline and move on.
The reason is that a lot of "vector search is failing us" turns out to be a retrieval-quality failure, not a structural one. If the right chunk is in your index but ranked sixth, that is a reranking problem, and a cross-encoder reranker that rescores the top candidates against the query fixes it for a fraction of a graph's cost and complexity. Many teams jump from "the answers are mediocre" straight to "we need a knowledge graph," skipping the cheap fix that would have solved it. Before you model a single entity, confirm the failure is genuinely relational and not a top-k that needs reordering.
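The rerank step has a simple shape: take the top k by vector score, rescore each passage against the query, and reorder. The sketch below assumes a pluggable `PairScorer`; in a real pipeline that would be a cross-encoder model call, and the `termOverlap` function here is only a toy stand-in so the example runs without a model.

```typescript
interface RetrievedPassage {
  id: string
  text: string
  vectorScore: number
}

// Hypothetical scorer signature: higher means the passage answers the query better.
type PairScorer = (query: string, passage: string) => number

// Shortlist the top k by vector score, then reorder the shortlist by the pair scorer.
function rerankTopK(
  query: string,
  passages: RetrievedPassage[],
  scorePair: PairScorer,
  k: number,
): RetrievedPassage[] {
  const shortlist = passages
    .slice()
    .sort((a, b) => b.vectorScore - a.vectorScore)
    .slice(0, k)
  return shortlist
    .map((passage) => ({ passage, rerankScore: scorePair(query, passage.text) }))
    .sort((a, b) => b.rerankScore - a.rerankScore)
    .map((entry) => entry.passage)
}

// Toy stand-in, NOT a cross-encoder: counts query terms that appear in the passage.
const termOverlap: PairScorer = (query, passage) => {
  const passageTerms = new Set(passage.toLowerCase().split(/\W+/))
  return query
    .toLowerCase()
    .split(/\W+/)
    .filter((term) => term.length > 0 && passageTerms.has(term)).length
}
```

One design consequence worth noticing: a reranker can only promote passages that made the shortlist, so `k` has to be large enough for the right chunk to be in it.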
Vector search with reranking is also far less to operate. There is no ontology to maintain, no entity resolution drifting as new aliases appear, and no community summaries to recompute. You embed, you retrieve, you rerank, you answer. That operational simplicity has real value, and it is the same instinct behind not over-engineering a React data layer before you have the problem that justifies it, the trap catalogued in AI-generated React code that fails in production. Reach for the heavier structure when the queries demand it, not because a graph is the impressive answer.
There is one honest caveat. If you have profiled your queries, confirmed a real relational workload, and a reranker still cannot answer "both A and B" or "trace the path from here to there," then you have found the genuine case for a graph, and you should build one. Start with a cheaper variant and a hybrid design, and prove the relational queries improve before you commit to the full pipeline.
Conclusion
The flood of GraphRAG content in 2026 quietly assumes you have the problem it solves. Most teams do not. A knowledge graph is the right tool for set intersection and path traversal, and it is dead weight for the document-lookup questions that make up the bulk of real RAG traffic. The discipline is to count your relational queries before you build anything, because that number, not the elegance of the architecture, is what decides the call.
If your queries are relational enough to justify a graph, build the smallest thing that proves it: a cheaper variant like LazyGraphRAG or LightRAG, wired as hybrid retrieval that routes by query type and boosts vector chunks tied to graph entities, so you keep similarity search and add traversal on top. If they are not, put a reranker on your vector index and spend the saved quarter on something your users will actually feel. The next relational question is coming whether you modeled for it or not. Make sure it is the question you built for, and not the one you imagined.


