Research code for quantifying and understanding uncertainty in neural abstractive summarization models, originally published at EMNLP 2020.
Defensibility
Stars: 30 · Forks: 1
This project is a static research artifact from 2020. With only 30 stars and 1 fork over a 5-year period, it lacks any meaningful community adoption or maintenance. The methods described (likely focusing on token-level probability and entropy in models like BART or T5) have been largely superseded by modern LLM-based evaluation techniques, conformal prediction, and semantic entropy methods. Frontier labs like OpenAI and Anthropic are aggressively building native hallucination detection and uncertainty scoring directly into their APIs and evaluation suites (e.g., OpenAI's SimpleQA or hallucination benchmarks), rendering this 2020-era academic code obsolete. There is no moat here; the code serves as a historical reference for the paper rather than a viable tool for current production pipelines. It competes with modern frameworks like RAGAS, DeepEval, and Arize Phoenix, which offer more comprehensive and updated metrics for the LLM era.
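For context, below is a minimal sketch of the kind of token-level uncertainty signal described above: per-step entropy of the decoder distribution and the probability of each generated token from a BART summarizer. This is an illustration built on the Hugging Face `transformers` API and the public `facebook/bart-large-cnn` checkpoint, not code from this repository; all variable names are assumptions.

```python
# Illustrative sketch of token-level uncertainty in abstractive summarization.
# Assumes `torch` and `transformers` are installed and the public
# "facebook/bart-large-cnn" checkpoint; this is NOT the repository's code.
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large-cnn")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")
model.eval()

article = "Replace this string with a source document to summarize."
inputs = tokenizer(article, return_tensors="pt", truncation=True)

with torch.no_grad():
    # Generate a summary while keeping the per-step scores (processed logits).
    out = model.generate(
        **inputs,
        max_new_tokens=60,
        num_beams=1,                   # greedy decoding keeps scores simple to read
        output_scores=True,
        return_dict_in_generate=True,
    )

summary_ids = out.sequences[0]
print(tokenizer.decode(summary_ids, skip_special_tokens=True))

# Token-level uncertainty: entropy of the softmax distribution at each decoding
# step, plus the probability the model assigned to the token it emitted.
for step, logits in enumerate(out.scores):
    probs = torch.softmax(logits[0], dim=-1)
    entropy = -(probs * torch.log(probs + 1e-12)).sum().item()
    chosen = summary_ids[step + 1].item()  # sequences[0] is the decoder start token
    print(f"step {step:2d}  token={tokenizer.decode([chosen])!r}  "
          f"p={probs[chosen].item():.3f}  entropy={entropy:.3f}")
```

High-entropy steps (or low probability on the emitted token) are the 2020-era signal for flagging potentially hallucinated spans; semantic-entropy and LLM-judge methods extend this idea beyond single-token statistics.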
TECH STACK
INTEGRATION: reference_implementation
READINESS