Practical writing
No content theatre

AI systems, delivery decisions, product operations and the mistakes worth writing down. Written by the senior team between engagements, not by a content calendar.

Orzed Models & Agents

Orzed Intake Agent: Brief Comprehension at the Front Door

The Intake Agent reads every customer brief that enters the Orzed Console. Built on Pulse, it produces a structured Intake Report for the Technical Review Team.

6 min read · intake · agents
Security and Trust

Prompt Injection Defence Beyond Input Filtering

Input filtering alone is not a defence against prompt injection. The layered architecture that keeps an LLM-driven system from being walked off its rails.

6 min read · prompt-injection · ai-security
Agent Systems

When to Use an Agent and When to Use a Pipeline

Agentic loops cost more, fail in strange ways and resist debugging. The honest test for whether your problem needs an agent or just a deterministic pipeline.

6 min read · agents · architecture
LLM Cost Engineering

Context Window Economics: The Hidden Bill on RAG

Stuffed context costs money on every call, even for tokens the model ignores. The discipline that keeps RAG context relevant, ranked and compressed.

6 min read · context-window · rag
AI Engineering

Evaluating LLMs Without a Research Team

A working evaluation gate that a small engineering team can build in a week, with the assertions, scoring and failure modes that make it production-credible.

6 min read · llm-evaluation · ai-testing
Security and Trust

EU AI Act Mapping for Engineering Teams

What the EU AI Act actually requires of an engineering team: the four risk tiers, the documentation burden, and the timeline that already started in 2025.

7 min read · eu-ai-act · compliance
AI Engineering

Prompt Registry: YAML File, Database Table, or Service?

Three working shapes for a production prompt registry, with the trade-offs that decide which one fits a team of three, thirty, or three hundred.

7 min read · prompt-registry · ai-engineering
AI Engineering

System Prompts That Survive Three Model Upgrades

A system prompt tuned to one model's quirks breaks on the next model. The structural patterns that decouple intent from model-specific tuning.

4 min read · system-prompt · llm
LLM Cost Engineering

Prompt Caching: Where It Pays Back and Where It Does Not

Provider-side prompt caching cuts cached input cost by up to 90 percent. The pattern that earns the discount and the configurations that waste it.

5 min read · prompt-caching · llm-cost
AI Engineering

LLM Output Streaming: The Edge Cases That Bite

Token streaming is the default for chat UIs and the source of subtle bugs. Partial JSON, truncated outputs, retries and the patterns that handle them.

6 min read · streaming · llm
Data Platform

When RAG Actually Helps and When It Hides Bad Retrieval

Retrieval-augmented generation looks like an answer to every grounding problem. The honest test for whether you need RAG, fine-tuning or a cleaner data source.

7 min read · rag · retrieval
Data Platform

Vector Store Sizing: The Cost Truth Nobody Tells You

What a million vectors actually costs across Pinecone, Qdrant, Weaviate and pgvector, and the configuration choices that can move the bill by 5x.

6 min read · vector-database · rag