Prompt Injection
One clever message. That's all it takes to override everything you told your AI to do. Your assistant becomes the attacker's assistant. Game over.
PromptSage V2.5 is an XML framework that gives your AI actual rules it actually follows — 7 layers of behavioral control plus defense against attacks you can't even see.
Award-winning research
Guardrail bypass rate
Unicode injection vs. major guardrails — arXiv:2504.11168
One clever message. That's all it takes to override everything you told your AI to do. Your assistant becomes the attacker's assistant. Game over.
An emoji walks into your prompt. Looks innocent. But it's carrying hidden instructions your eyes can't see — and your AI follows them blindly. This is real, and it works on everything.
Your AI starts strong. By turn 15, it's forgotten half its instructions. By turn 30, it's making up its own rules. This isn't hallucination — it's policy drift, and every unstructured prompt does it.
Rules that can never be broken
The foundation. No matter what a user types, no matter how clever the injection — these rules don't budge. Think of it as the AI's constitution. Everything else can be argued. This can't.
Key insight: The architecture is self-reinforcing — it exploits how LLMs actually process instructions, not how we wish they would.
Security Boundary
Input Normalization (V2.5)NEW
Identity
Core Directives
Mode Control
Behavioral Protocols
Customizable Defaults
Structural Reinforcement
Layers 1 & 7 create structural redundancy — the architecture closes its own loop
What you see
Hello! 🙂 Can you help me with something?
Looks innocent. A user asking for help.
What the AI receives (decoded)
Hidden characters encode instructions humans cannot read.
90%+ bypass
Emoji injection vs. tested guardrails (arXiv:2504.11168)
Blocked by PromptSage V2.5
Unicode normalisation catches it before it reaches the model
| Feature | PromptSage V2.5 | Unstructured | DSPy / LMQL | Fine-Tuning |
|---|---|---|---|---|
| Behavioral control | 7-layer hierarchy | Implicit / guessed | Task-focused | Model-level |
| Injection defense | 5-layer + Unicode | |||
| Unicode injection defense | ||||
| Setup time | Minutes (XML template) | Minutes (unreliable) | Hours (code) | Days (data + compute) |
| Cross-model compatible | ||||
| Cost | $0 (prompt-only) | $0 | $0 | $$$$ (compute) |
| Continuous compliance | ||||
| Structural reinforcement |
Let's talk
Whether you want to understand the research, get PromptSage implemented in your system, or just want to nerd out about AI security — I'm here for all of it.
The Receipts
Not a weekend project
Four years of research, 30+ academic citations, three awards, and five AI model families tested. PromptSage powers real production systems — including the ones that won these.
Awards
EU Green Innovation Days 2025
1st place — NeuroBridgeEDU recognised for sustainable AI architecture in education
Irish Enterprise Awards 2026
Best AI Innovation — NeuroBridge AI Labs, county Leitrim, Ireland
Ethical AI Excellence Award 2026
Recognised for transparent, accountable AI system design and privacy-first architecture
Academic Research
Research paper (pre-publication)
755 lines
Citations & references
30+
Research foundation
Publication pending
Cross-Model Tested