Markus Hupfauer

Notes on AI security, agentic systems, and non-human identity. Personal site, personal opinions.

Auto mode is a sensor too

This is the third time this year I have written about the same epistemic mistake. Auto-accept loops in agentic IDEs treat a classifier as a control, and real bypasses already exist. The sufficiently motivated attacker problem is now sitting on your laptop, deciding which files to edit while you make coffee.

2026-05-17 · 9 min · Markus Hupfauer

Identity is the control plane. Detection is a sensor.

Detection asks ‘is this input adversarial?’ Identity asks ‘what is this principal allowed to do, on whose behalf, right now?’ The first is probabilistic and bypassable. The second is enforceable and auditable.

2026-05-17 · 4 min · Markus Hupfauer

Salting your own well: defensive prompt injection as a tripwire

Defenders can deliberately plant content in their environments that triggers the refusal vectors of attacker-controlled agents. Against the median lazy adversary it works. Against a determined one with an abliterated model it doesn’t. Either way, it is a sensor, not a control.

2026-05-11 · 7 min · Markus Hupfauer