Automated activity logging and task management through periodic screen capture and VLM-based visual analysis of user workflows.
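DoWhat's source is not shown here, but the capture-and-analyze loop described above can be sketched as follows. `ActivityLogger`, `capture`, and `describe` are hypothetical names: in a real implementation `capture` would wrap a screenshot backend (e.g. `mss` or `PIL.ImageGrab`) and `describe` would wrap a VLM client call.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ActivityEntry:
    timestamp: float
    intent: str  # VLM's description of what the user is doing


@dataclass
class ActivityLogger:
    """Sketch of the periodic screen-capture -> VLM analysis loop."""
    capture: Callable[[], bytes]      # hypothetical screenshot backend
    describe: Callable[[bytes], str]  # hypothetical VLM intent extractor
    interval_s: float = 30.0
    log: List[ActivityEntry] = field(default_factory=list)

    def tick(self) -> None:
        """One capture/analyze cycle; skips consecutive duplicate intents."""
        frame = self.capture()
        intent = self.describe(frame)
        if not self.log or self.log[-1].intent != intent:
            self.log.append(ActivityEntry(time.time(), intent))

    def run(self, cycles: int) -> None:
        """Run a fixed number of cycles, sleeping between captures."""
        for _ in range(cycles):
            self.tick()
            time.sleep(self.interval_s)
```

The deduplication in `tick` is one plausible design choice: sampling every 30 seconds produces many frames of the same activity, so only transitions between intents are worth logging.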
DoWhat is an early-stage prototype (1 star, 18 days old) that implements the 'AI Screen Observer' pattern. While the privacy-first, local-first approach targets a valid niche, the project currently lacks a technical moat and significant community traction. It builds on OpenClaw, suggesting it is more an assembly of existing frameworks than a novel engine.

Competitive Landscape: The project faces immense pressure both from well-funded startups such as Limitless (formerly Rewind) and OpenAdapt and from OS-level integrations: Microsoft Recall (despite its PR hurdles) and Apple Intelligence are moving directly into this 'screen-aware' agent space.

Defensibility: The score is low (2) because the core loop, periodic screenshots fed into a VLM for intent extraction, has become a standard design pattern in the 'Computer Use' agent category. Without a massive dataset of user interactions or a proprietary high-speed local vision model, it is easily reproducible.

Frontier Risk: High. OpenAI (via 'Operator' and its desktop apps) and Anthropic (via 'Computer Use') are aggressively optimizing models for exactly this kind of interface interaction. As these labs ship more efficient vision-to-action capabilities, the value of thin wrapper applications like DoWhat diminishes unless they offer deep vertical integration into specific work tools (e.g., Jira, Slack, GitHub).