Detecting benchmark contamination in LLMs by embedding 'radioactive' watermarks into test datasets via LLM-based paraphrasing, enabling statistical detection of leakage in model outputs.
citations: 0
co_authors: 5
The project addresses a critical 'crisis of trust' in LLM evaluation: benchmark contamination. Its defensibility is currently low (4) because the value lies not in code complexity but in the potential for this specific watermarking protocol to be adopted as a standard by benchmark creators. With 0 stars and 5 forks, it lacks the network effects needed to form a 'moat.' The concept of 'radioactive data' was pioneered by Meta AI researchers; this project applies that concept specifically to benchmarks. The primary risk is platform domination: if Hugging Face or a major evaluation harness (such as LM Eval Harness) integrates a similar watermarking/detection scheme natively, this standalone implementation becomes obsolete. Frontier labs are unlikely to build this themselves (as they are the ones being tested), but third-party audit firms and evaluation platforms are the natural owners of this space. At 410 days old, the project has not yet achieved viral adoption in the developer community, remaining primarily a research artifact.
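The mechanism described above, embedding a watermark by selecting among paraphrases with a secret key, then statistically testing whether a model reproduces the keyed variants more often than chance, can be sketched as follows. This is a minimal illustration, not the project's actual implementation; the function names (`keyed_choice`, `contamination_p_value`) and the simple binomial test are assumptions for the sake of the example.

```python
import hashlib
import math


def keyed_choice(item_id: str, variants: list[str], secret_key: str) -> str:
    """Deterministically pick one paraphrase variant per benchmark item.

    The benchmark creator publishes only the keyed variant of each item;
    the secret item -> variant mapping acts as the 'radioactive' watermark.
    """
    digest = hashlib.sha256(f"{secret_key}:{item_id}".encode()).digest()
    return variants[digest[0] % len(variants)]


def binomial_p_value(hits: int, trials: int, p_null: float) -> float:
    """One-sided P(X >= hits) for X ~ Binomial(trials, p_null)."""
    return sum(
        math.comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(hits, trials + 1)
    )


def contamination_p_value(matches: list[bool], n_variants: int) -> float:
    """Test whether a model prefers the keyed variants beyond chance.

    `matches[i]` is True if, for item i, the model's output (or its
    highest-likelihood completion) matches the secret keyed paraphrase.
    Under the null hypothesis (no leakage), each match occurs with
    probability 1/n_variants, so a small p-value signals contamination.
    """
    return binomial_p_value(sum(matches), len(matches), 1.0 / n_variants)
```

In practice an auditor would score each watermarked item against a suspect model, collect the match indicators, and flag contamination when the p-value falls below a pre-registered threshold; an uncontaminated model should match at roughly the chance rate of 1/n_variants.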
TECH STACK
INTEGRATION: reference_implementation
READINESS