On-device data sanitization framework for robust Federated Learning (FL) alignment of Small Language Models (SLMs), specifically targeting the removal of toxic or unsafe information from private client datasets.
Defensibility
Citations: 0
Co-authors: 4
FedDetox addresses a specific niche in the LLM ecosystem: the intersection of Federated Learning (FL), Small Language Models (SLMs), and safety alignment. While most safety research focuses on intentional adversarial attacks, this project identifies 'unintended data poisoning' (natural toxicity in user data) as a primary hurdle for on-device fine-tuning. Quantitatively, the project is in its infancy (0 stars, 4 forks, 9 days old), functioning as a reference implementation for an arXiv paper rather than a production-ready tool. Its defensibility is low because the core logic—sanitizing data before gradient updates in an FL round—is a logical extension of existing FL robustness patterns and can be easily replicated by established FL frameworks like Flower or FedML. The primary threat comes from platform owners (Apple, Google) who control the OS-level integration of SLMs; if federated alignment becomes a standard feature for Siri or Android Assistant, these companies will likely implement proprietary, vertically integrated versions of this logic, leaving little room for a standalone third-party library.
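The core logic the analysis refers to — filtering naturally occurring toxic examples out of a client's private dataset before they contribute to a local gradient update in an FL round — can be sketched as follows. This is an illustrative sketch of the general pattern, not FedDetox's actual implementation: the `toxicity_score` blocklist scorer, the `0.1` threshold, and the `local_update` stand-in are all placeholder assumptions (a real system would use a trained on-device toxicity classifier).

```python
# Sketch of on-device data sanitization before a federated local update.
# All names and thresholds here are hypothetical placeholders.

def toxicity_score(text: str) -> float:
    """Placeholder scorer: fraction of tokens found in a toy blocklist.
    A production system would run a small trained toxicity classifier."""
    blocklist = {"toxicword", "unsafeword"}  # illustrative only
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(t in blocklist for t in tokens) / len(tokens)

def sanitize_client_dataset(examples: list[str], threshold: float = 0.1) -> list[str]:
    """Drop examples whose toxicity score exceeds the threshold,
    so they never reach the client's local fine-tuning step."""
    return [ex for ex in examples if toxicity_score(ex) <= threshold]

def local_update(examples: list[str]) -> int:
    """Stand-in for the client's local SGD step; here it just reports
    how many sanitized examples would drive the gradient update."""
    return len(examples)

if __name__ == "__main__":
    client_data = [
        "a perfectly benign sentence",
        "contains toxicword somewhere",
        "another clean example",
    ]
    clean = sanitize_client_dataset(client_data)
    print(local_update(clean))  # 2 examples survive sanitization
```

Because the filter runs before the gradient is computed, the server-side aggregation step needs no changes — which is also why, as noted above, the pattern is straightforward for frameworks like Flower or FedML to replicate.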
TECH STACK
INTEGRATION: reference_implementation
READINESS