Detecting benchmark contamination in LLMs by embedding 'radioactive' watermarks into test datasets via LLM-based paraphrasing, enabling statistical detection of leakage in model outputs.
citations: 0
co_authors: 5
The project addresses a critical 'crisis of trust' in LLM evaluation: benchmark contamination. Its defensibility is currently low (4) because the value lies not in code complexity but in the potential for this specific watermarking protocol to be adopted as a standard by benchmark creators. With 0 stars and 5 forks, it lacks the network effects needed to form a 'moat.' The concept of 'radioactive data' was pioneered by Meta AI researchers; this project applies that concept specifically to benchmarks. The primary risk is platform domination: if Hugging Face or a major evaluation harness (such as LM Eval Harness) integrates a similar watermarking/detection scheme natively, this standalone implementation becomes obsolete. Frontier labs are unlikely to build this themselves (as they are the ones being tested), but third-party audit firms and evaluation platforms are the natural owners of this space. At 410 days old, the project has not yet achieved viral adoption in the developer community, remaining primarily a research artifact.
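The mechanism described above, embedding a watermark by selecting among paraphrases with a secret key, then statistically testing whether a model reproduces the keyed variants more often than chance, can be sketched as follows. This is a minimal illustration, not the project's actual implementation; the function names (`keyed_choice`, `contamination_p_value`) and the simple binomial test are assumptions for the sake of the example.

```python
import hashlib
import math


def keyed_choice(item_id: str, variants: list[str], secret_key: str) -> str:
    """Deterministically pick one paraphrase variant per benchmark item.

    The benchmark creator publishes only the keyed variant of each item;
    the secret item -> variant mapping acts as the 'radioactive' watermark.
    """
    digest = hashlib.sha256(f"{secret_key}:{item_id}".encode()).digest()
    return variants[digest[0] % len(variants)]


def binomial_p_value(hits: int, trials: int, p_null: float) -> float:
    """One-sided P(X >= hits) for X ~ Binomial(trials, p_null)."""
    return sum(
        math.comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(hits, trials + 1)
    )


def contamination_p_value(matches: list[bool], n_variants: int) -> float:
    """Test whether a model prefers the keyed variants beyond chance.

    `matches[i]` is True if, for item i, the model's output (or its
    highest-likelihood completion) matches the secret keyed paraphrase.
    Under the null hypothesis (no leakage), each match occurs with
    probability 1/n_variants, so a small p-value signals contamination.
    """
    return binomial_p_value(sum(matches), len(matches), 1.0 / n_variants)
```

In practice an auditor would score each watermarked item against a suspect model, collect the match indicators, and flag contamination when the p-value falls below a pre-registered threshold; an uncontaminated model should match at roughly the chance rate of 1/n_variants.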
TECH STACK
INTEGRATION: reference_implementation
READINESS