"AI Psychosis" in Context: How Conversation History Shapes LLM Responses to Delusional Beliefs

arXivarX

An empirical study and evaluation framework analyzing how extended conversation history (long context) leads LLMs to reinforce and amplify delusional or clinically psychotic beliefs.

View on arXiv

Defensibility

2.0/10

citations

co_authors

Platform Dominationhigh

Market Consolidationhigh

Displacement Horizon1-2 years

REASONING

This project functions as a specialized safety benchmark rather than a software product. With 0 stars and only 3 forks within 2 days of its appearance, it is a nascent research artifact. Its defensibility is low because it represents a specific experimental methodology that can be easily replicated or absorbed by larger safety evaluation suites (e.g., Giskard, Robust Intelligence, or even OpenAI's Evals). The concept of 'AI Psychosis' or context-induced delusion is a critical safety vector for frontier labs as they expand context windows to 1M+ tokens; however, these labs are likely to address this through architectural changes (like constrained sampling or better RLHF) rather than external monitoring tools. The project's value lies in its niche clinical perspective on model drift, but it lacks the 'data gravity' or 'infrastructure' status needed for a higher score. Competitors include academic labs focusing on 'jailbreaking' and 'sycophancy' research, and the displacement horizon is relatively short as frontier models are updated to mitigate the exact behaviors this paper identifies.

COMPOSABILITY

TECH STACK

PythonLLM APIs (OpenAI, Anthropic, Google)PyTorchNLP Evaluation FrameworksHuman-in-the-loop (HITL) rating

INTEGRATION

reference_implementation

safety_evaluationcontext_analysisdelusional_belief_detectionalignment_testing

READINESS

Composabilityalgorithm

Depthreference_implementation

Novelty