Open Research
We publish practical, system-level AI research focused on real-world performance, reliability, and production outcomes — from architecture to deployment behavior.
Hardened Shell: Securing LLM Agents Against OpenClaw Vulnerabilities
Authors: Dezso Mezo, Joran Bjarne van Beek
This paper investigates critical security failures in tool-using agent architectures. We present a defense-in-depth framework focused on predictable execution, tool-injection resistance, and governance enforcement under real-world constraints.
“Research is only useful if it becomes deployable — measurable, repeatable, and grounded in real constraints.”
What We Study
We investigate practical AI system design: how models behave under real data, real latency, real cost limits, and real user expectations — then translate that into production-ready patterns.
Input Sanitization
Prevent malicious or malformed inputs from ever reaching the model's context. We test filters, parsers, and schema guards that reduce injection risk and improve output stability.
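A minimal sketch of such a schema guard, using only the standard library. The field names, the deny-list patterns, and the `sanitize` helper are illustrative assumptions, not a description of our production filters, which layer parsing and allow-lists on top of checks like these.

```python
import re

# Illustrative schema: only these fields, with these types, may pass through.
ALLOWED_FIELDS = {"query": str, "max_results": int}

# Naive deny-list for common injection markers (assumption: a real filter
# would be far more thorough and combined with structural parsing).
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"<\s*system\s*>", re.I),
]

def sanitize(payload: dict) -> dict:
    """Return a cleaned payload, or raise ValueError if it fails the guard."""
    clean = {}
    for key, expected_type in ALLOWED_FIELDS.items():
        if key not in payload:
            raise ValueError(f"missing required field: {key}")
        value = payload[key]
        if not isinstance(value, expected_type):
            raise ValueError(f"{key}: expected {expected_type.__name__}")
        if isinstance(value, str):
            for pattern in INJECTION_PATTERNS:
                if pattern.search(value):
                    raise ValueError(f"{key}: suspected injection content")
        clean[key] = value
    # Unknown keys are dropped rather than forwarded to the model.
    return clean
```

Dropping unrecognized keys, rather than forwarding them, keeps the model's context surface fixed even when upstream payloads change.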
Strict QA Guardrails
Verification loops for every step: assertions, tool output validation, and deterministic checks that stop unsafe execution before it propagates into production.
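One way such a verification loop can look, as a hedged sketch: every tool result passes a list of deterministic checks before the pipeline is allowed to continue. The specific checks (`check_non_empty`, `check_no_secrets`) are hypothetical stand-ins for project-specific assertions.

```python
# Each check returns (ok, reason); all must pass before a step's result
# is handed to the next stage.
def check_non_empty(result):
    return bool(result), "tool returned an empty result"

def check_no_secrets(result):
    # Illustrative marker scan; a real check would use a secrets scanner.
    return "API_KEY" not in str(result), "output contains a credential marker"

CHECKS = [check_non_empty, check_no_secrets]

def run_step(tool, *args):
    """Run one tool call and gate its output through deterministic checks."""
    result = tool(*args)
    for check in CHECKS:
        ok, reason = check(result)
        if not ok:
            # Stop unsafe execution here, before it propagates downstream.
            raise RuntimeError(f"guardrail failed: {reason}")
    return result
```

Because the checks are deterministic, a failing step fails the same way every time, which makes incidents reproducible rather than probabilistic.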
Output Sandboxing
We isolate model outputs from critical systems, enforce permissions, and constrain tool execution so a single bad response cannot escalate into system-level damage.
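A minimal sketch of that isolation boundary, under the assumption that tool calls go through a central dispatcher: the model's output can only request a tool by name, and a per-agent permission table decides whether the call runs. Agent names, tool names, and the `dispatch` helper are illustrative.

```python
# Per-agent permission table (assumption: no agent gets "shell" by default).
PERMISSIONS = {
    "research_agent": {"web_search", "read_file"},
}

# Tool registry; the lambdas are stand-ins for real implementations.
TOOLS = {
    "web_search": lambda q: f"results for {q!r}",
    "read_file": lambda p: f"contents of {p}",
    "shell": lambda cmd: f"ran {cmd}",  # powerful tool, denied by default
}

def dispatch(agent: str, tool_name: str, arg: str):
    """Execute a tool only if the calling agent holds permission for it."""
    allowed = PERMISSIONS.get(agent, set())
    if tool_name not in allowed:
        # A single bad model response cannot escalate past this boundary.
        raise PermissionError(f"{agent} may not call {tool_name}")
    return TOOLS[tool_name](arg)
```

The key design choice is that the model never holds a direct reference to a tool function; it can only name one, and the dispatcher owns the decision.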
Production Evaluation
We measure what actually matters: failure modes, drift, latency, and cost. Then we tune pipelines so results stay reliable after deployment.
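As a small illustration of the kind of harness this implies (the record shape and metric names are assumptions, not our actual schema): log latency, cost, and success per run, then aggregate into the numbers that drive tuning.

```python
import statistics

def summarize(runs):
    """Aggregate per-run records into deployment-facing metrics.

    runs: list of dicts with 'latency_s', 'cost_usd', and 'ok' keys
    (illustrative field names).
    """
    failures = [r for r in runs if not r["ok"]]
    latencies = [r["latency_s"] for r in runs]
    return {
        "failure_rate": len(failures) / len(runs),
        "p50_latency_s": statistics.median(latencies),
        "total_cost_usd": sum(r["cost_usd"] for r in runs),
    }
```

Tracking these same aggregates before and after deployment is what makes drift visible as a number rather than an anecdote.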
Global Impact
We share frameworks and findings that help teams ship better AI — faster iteration, clearer evaluation, and more predictable performance in production.
If you want to collaborate on applied AI research, reach out.