Simon Paris
I design AI systems that don't break.
Most LLM systems running in production today are clever prototypes held together with duct tape. They work in the demo. They fail in ways you can't reproduce. Post-mortems end with "the model did something weird."
I came up through backend systems — the kind where failure has consequences. When teams started pulling me into GenAI work, I kept seeing the same gap: engineers treating LLMs like deterministic functions, then being surprised when they weren't. The STATE framework came out of that gap.

State Beats Intelligence.
A mid-tier model with proper state management outperforms a frontier model running stateless — every single time. The model is not your reliability problem. The architecture around it is.
This is not a philosophy. It's a constraint set — for AI systems that have to work in regulated, revenue-critical, and user-facing contexts where "it usually works" is not a delivery standard.
Five pillars. Zero ambiguity.
Every operation initializes a typed state object. Stage always reflects current execution position.
Every LLM call, API call, and stage transition is logged with all required fields. No black boxes.
Any automated decision affecting an individual has a decision record. Law 25, OSFI, and EU AI Act compliance by construction.
Workflow resumes from step 6 after a crash at step 6. Not from step 1. Idempotency is not optional.
Every LLM output passes a validation gate before any write. Invalid output goes to the error path. Never silent continue.
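The pillars above can be sketched in a few lines. This is a minimal illustration, not the STATE framework's actual code: every name here (`WorkflowState`, `Stage`, `validation_gate`) is hypothetical, and a real system would persist state and log structured fields rather than tuples.

```python
# Illustrative sketch of "typed state + logged transitions + validation gate".
# All names are hypothetical, not taken from the STATE framework itself.
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    INIT = "init"
    EXTRACT = "extract"
    WRITE = "write"
    ERROR = "error"


@dataclass
class WorkflowState:
    """Typed state object: stage always reflects current execution position."""
    request_id: str
    stage: Stage = Stage.INIT
    log: list = field(default_factory=list)

    def transition(self, stage: Stage, detail: str) -> None:
        # Every stage transition is logged. No black boxes.
        self.log.append((self.stage.value, stage.value, detail))
        self.stage = stage


def validation_gate(state: WorkflowState, llm_output: dict) -> bool:
    """LLM output passes this gate before any write. Invalid -> error path."""
    required = {"answer", "confidence"}
    missing = required - set(llm_output)
    if missing:
        state.transition(Stage.ERROR, f"missing fields: {missing}")
        return False
    state.transition(Stage.WRITE, "output validated")
    return True


state = WorkflowState(request_id="req-42")
state.transition(Stage.EXTRACT, "LLM call issued")
ok = validation_gate(state, {"answer": "yes"})  # no "confidence" field
print(state.stage)  # Stage.ERROR, never a silent continue
```

The point of the sketch: when the output is invalid, the workflow lands in an explicit error stage with a logged reason, instead of writing bad data and moving on.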
Five lenses. One spine.
Naming and classifying LLM failure modes with precision. These are always state failures in disguise.
Demonstrations of STATE pillars in real architecture decisions. Before/after comparisons.
Design patterns that make AI systems tolerant by construction. Validation gates, locks, idempotency.
How I use AI to do the work most people do manually — including figuring out what to ask.
Quebec Law 25, OSFI, EU AI Act as architecture requirements, not compliance checkboxes.
Four ways in.
Notes from production.
Failure taxonomies, defensive patterns, and architecture decisions for LLM systems that have to work. No tutorial content. Practitioner-only.
Read the blog →
STATE Readiness Score.
Score your LLM system across the five STATE pillars. Takes 8 minutes. Tells you exactly where your architecture is exposed.
Score your system →
No Stack Trace.
A live session on why LLM systems fail in production, and how to build the observability layer that shows you. No slides-only theory.
Register for free →
Production-Grade LLM Architecture.
A hands-on cohort program. You build a stateful, observable, auditable LLM system from scratch. Designed for teams already in production.
Apply to the cohort →
Start with the diagnostic.
If you're running LLM pipelines in production and something feels off — it probably is. The STATE Score tells you what to fix first.