Standardized benchmarking framework for evaluating Graph Retrieval-Augmented Generation (GraphRAG) systems across various reasoning tasks and dataset types.
Defensibility
stars: 390
forks: 49
GraphRAG-Bench addresses a critical gap in the RAG ecosystem: determining exactly where graph structures provide ROI over traditional vector-based RAG. With 390 stars and 49 forks, it has established a foothold as a credible research artifact and is explicitly linked to an ICLR 2026 submission. Its defensibility stems from its 'curation moat': the specific datasets and task formulations required to test graph-traversal capabilities (multi-hop reasoning, global summarization).

However, it faces stiff competition from industry-standard evaluation frameworks such as RAGAS, which are rapidly expanding into graph-specific metrics, and from Microsoft's own GraphRAG implementation, which includes its own internal evaluation suites. The project is currently a reference implementation for a paper; its long-term survival depends on whether it evolves into a living library (like MTEB) or remains a static research snapshot.

The 'medium' frontier risk reflects that while labs such as OpenAI and Anthropic are building better RAG systems, they rarely build the evaluation benchmarks themselves, preferring to let the community define the standards they then optimize against.
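To make the 'task formulation' point concrete, a multi-hop item pairs a question with the chain of supporting passages a retriever must traverse; a vector-only retriever typically surfaces the first hop but misses later ones, which is the gap graph traversal is meant to close. The sketch below is a minimal, hypothetical illustration under assumed names (the `MultiHopItem` fields and `score_multihop` scorer are illustrative conventions, not GraphRAG-Bench's actual schema or API).

```python
from dataclasses import dataclass

@dataclass
class MultiHopItem:
    """Hypothetical multi-hop benchmark item: answering requires chaining
    evidence across linked passages rather than a single vector lookup."""
    question: str
    gold_answer: str
    gold_evidence_chain: list[str]  # ordered passage IDs forming the reasoning path

def score_multihop(item: MultiHopItem, predicted_answer: str,
                   retrieved_ids: list[str]) -> dict:
    """Toy scorer: exact-match on the answer plus recall over the gold
    evidence chain, the two axes a graph-vs-vector comparison cares about."""
    answer_em = float(predicted_answer.strip().lower() == item.gold_answer.strip().lower())
    covered = sum(1 for pid in item.gold_evidence_chain if pid in retrieved_ids)
    evidence_recall = covered / len(item.gold_evidence_chain)
    return {"answer_em": answer_em, "evidence_recall": evidence_recall}

# Example: the second hop ("d7") is only reachable by following a link from "d2",
# so a retriever that stops after one hop scores 0.5 on evidence recall.
item = MultiHopItem(
    question="Which lab employs the advisor of the author of Paper X?",
    gold_answer="Lab Y",
    gold_evidence_chain=["d2", "d7"],
)
print(score_multihop(item, "Lab Y", retrieved_ids=["d2", "d9"]))
# {'answer_em': 1.0, 'evidence_recall': 0.5}
```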
TECH STACK
INTEGRATION: reference_implementation
READINESS