-
What the Evaluator Needs to Be
The previous posts in this series made the case for why behavioral alignment alone won’t hold once AI systems gain memory, tool use, and recursive self-improvement. Constraint-by-Balance proposes a structural answer: embed harm-balancing logic directly into the agent’s runtime flow, so that constraint operates independently of optimization. This post lays out what that means in
-
A Model of AI Agent Types
The last two posts looked at the motivations for the C-by-B architecture and at how current AI behaviors hint at more dire future alignment issues. With this post we switch from concerns to remedies. We will start by grounding the C-by-B architecture in a model of AI agent types. Efficiency and Efficacy –
-
The Motivation for Constraint-by-Balance: The Safety Gap After Deployment
What will the future look like once it is populated with all manner of AI agents? Do our current safety approaches fully encompass the risks of that future? The best-known approaches to AI safety (RLHF, Constitutional AI, scalable oversight, interpretability research) have made remarkable progress at aligning model behavior during training and evaluation. These methods