TL;DR: The Problem and Solution
Problem: AI-assisted coding fails when modifying existing systems because we give AI vague specifications.
Solution: Use TLA+ formal specifications as precise contracts that Claude can implement reliably.
Result: Transform Claude from a code generator into a reliable engineering partner that reasons about complex systems.
After months of using Claude for development, I discovered most AI-assisted coding fails not because the AI isn’t smart enough, but because we’re asking it to work from vague specifications. This post shows you how to move beyond “vibe coding” using executable specifications that turn Claude into a reliable engineering partner.
Here’s what changes when you use TLA+ with Claude:
Before (Vibe Coding):
- “Create a task management API”
- Claude guesses at requirements
- Inconsistent behavior across edge cases
- Bugs in corner cases
After (TLA+ Specifications):
- Precise mathematical specification
- Claude implements exactly what you specified
- All edge cases defined upfront
- Properties verified before deployment
The Vibe Coding Problem
AI assistants like Claude are primarily trained on greenfield development patterns. They excel at:
- Writing new functions from scratch
- Implementing well-known algorithms
- Creating boilerplate code
But they struggle with:
- Understanding implicit behavioral contracts in existing code
- Maintaining invariants across system modifications
- Reasoning about state transitions and edge cases
- Preserving non-functional requirements (performance, security, etc.)
The solution isn’t better prompts – it’s better specifications.
Enter Executable Specifications
An executable specification is a formal description of system behavior that can be:
- Verified – Checked for logical consistency
- Validated – Tested against real-world scenarios
- Executed – Run to generate test cases or even implementations
I’ve tried many approaches to precise specifications over the years:
UML and Model Driven Development (2000s-2010s): I used tools like Rational Rose and Visual Paradigm in early 2000s that promised complete code generation from UML models. The reality was different:
- Visual complexity: UML diagrams became unwieldy for anything non-trivial
- Tool lock-in: Proprietary formats and expensive tooling
- Impedance mismatch: The gap between UML models and real code was huge
- Maintenance nightmare: Keeping models and code synchronized was nearly impossible
- Limited expressiveness: UML couldn’t capture complex behavioral contracts
BDD and Gherkin (mid-2000s): I used BDD and Gherkin in mid 2000s, which were better than UML for behavioral specifications, but still limited:
- Structured natural language: Readable but not truly executable
- No logical reasoning: Couldn’t catch design contradictions
- Testing focused: Good for acceptance criteria, poor for system design
TLA+ (present): Takes executable specifications to their logical conclusion:
- Mathematical precision: Eliminates ambiguity completely
- Model checking: Explores all possible execution paths
- Tool independence: Plain text specifications, open source tools
- Behavioral focus: Designed specifically for concurrent and distributed systems
Why TLA+ with Claude?
The magic happens when you combine TLA+’s precision with Claude’s implementation capabilities:
- TLA+ eliminates ambiguity – There’s only one way to interpret a formal specification
- Claude can read TLA+ – It understands the formal syntax and can translate it to code
- Verification catches design flaws – TLA+ model checking finds edge cases you’d miss
- Generated traces become tests – TLA+ execution paths become your test suite

Setting Up Your Claude and TLA+ Environment
Installing Claude Desktop
First, let’s get Claude running on your machine:
# Install via Homebrew (macOS) brew install --cask claude # Or download directly from Anthropic # https://claude.ai/download
- Set up project-specific contexts in ~/.claude/
- Create TLA+ syntax rules for better code generation
- Configure memory settings for specification patterns
Configuring Your Workspace
Once installed, I recommend creating a dedicated workspace structure. Here’s what works for me:
# Create a Claude workspace directory
mkdir -p ~/claude-workspace/{projects,templates,context}
# Add a context file for your coding standards
cat > ~/claude-workspace/context/coding-standards.md << 'EOF'
# My Coding Standards
- Use descriptive variable names
- Functions should do one thing well
- Write tests for all new features
- Handle errors explicitly
- Document complex logic
EOFInstalling TLA+ Tools
Choose based on your workflow
- GUI users: TLA+ Toolbox for visual model checking
- CLI users: tla2tools.jar for CI integration
- Both: VS Code extension for syntax highlighting
# Download TLA+ Tools from https://github.com/tlaplus/tlaplus/releases # Or use Homebrew on macOS brew install --cask tla-plus-toolbox # For command-line usage (recommended for CI) wget https://github.com/tlaplus/tlaplus/releases/download/v1.8.0/tla2tools.jar
VS Code Extension
Install the TLA+ extension for syntax highlighting and basic validation:
code --install-extension alygin.vscode-tlaplus
Your First TLA+ Specification
Let’s start with a simple example to understand the syntax:
--------------------------- MODULE SimpleCounter --------------------------- VARIABLE counter Init == counter = 0 Increment == counter' = counter + 1 Decrement == counter' = counter - 1 Next == Increment \/ Decrement Spec == Init /\ [][Next]_counter TypeInvariant == counter \in Int =============================================================================
This specification defines:
- State: A counter variable
- Initial condition: Counter starts at 0
- Actions: Increment or decrement operations
- Next state relation: Either action can occur
- Invariant: Counter is always an integer
Real-World Example: Task Management API
Now let’s build something real. We’ll create a task management API using TLA+ specifications that Claude can implement in Go.
Step 1: Define the System State
First, we model what our system looks like (TaskManagement.tla):
--------------------------- MODULE TaskManagement ---------------------------
EXTENDS Integers, Sequences, FiniteSets, TLC
CONSTANTS
Users, \* Set of users
MaxTasks, \* Maximum number of tasks
MaxTime, \* Maximum time value for simulation
Titles, \* Set of possible task titles
Descriptions \* Set of possible task descriptions
VARIABLES
tasks, \* Function from task ID to task record
userTasks, \* Function from user ID to set of task IDs
nextTaskId, \* Counter for generating unique task IDs
currentUser, \* Currently authenticated user
clock, \* Global clock for timestamps
sessions \* Active user sessions
\* Task states enumeration with valid transitions
TaskStates == {"pending", "in_progress", "completed", "cancelled", "blocked"}
\* Priority levels
Priorities == {"low", "medium", "high", "critical"}
\* Valid state transitions
ValidTransitions == {
<<"pending", "in_progress">>,
<<"pending", "cancelled">>,
<<"pending", "blocked">>,
<<"in_progress", "completed">>,
<<"in_progress", "cancelled">>,
<<"in_progress", "blocked">>,
<<"in_progress", "pending">>, \* Allow reverting to pending
<<"blocked", "pending">>,
<<"blocked", "in_progress">>,
<<"blocked", "cancelled">>
}
TaskRecord == [
id: Nat,
title: STRING,
description: STRING,
status: TaskStates,
priority: {"low", "medium", "high"},
assignee: Users,
createdAt: Nat,
dueDate: Nat \cup {NULL}
]
\* Type invariants
TypeInvariant ==
/\ tasks \in [Nat -> TaskRecord]
/\ userTasks \in [Users -> SUBSET Nat]
/\ nextTaskId \in Nat
/\ currentUser \in Users \cup {NULL}
Step 2: Define System Actions
Now we specify what operations are possible (TaskManagement.tla):
\* System initialization
Init ==
/\ tasks = [i \in {} |-> CHOOSE x : FALSE] \* Empty function
/\ userTasks = [u \in Users |-> {}]
/\ nextTaskId = 1
/\ currentUser = "NULL"
/\ clock = 0
/\ sessions = [u \in Users |-> FALSE]
\* User authentication
Authenticate(user) ==
/\ user \in Users
/\ ~sessions[user] \* User not already logged in
/\ currentUser' = user
/\ sessions' = [sessions EXCEPT ![user] = TRUE]
/\ UNCHANGED <<tasks, userTasks, nextTaskId, clock>>
\* Create a new task
CreateTask(title, description, priority, dueDate) ==
/\ currentUser # NULL
/\ nextTaskId <= MaxTasks
/\ LET newTask == [
id |-> nextTaskId,
title |-> title,
description |-> description,
status |-> "pending",
priority |-> priority,
assignee |-> currentUser,
createdAt |-> nextTaskId, \* Simplified timestamp
dueDate |-> dueDate
] IN
/\ tasks' = tasks @@ (nextTaskId :> newTask)
/\ userTasks' = [userTasks EXCEPT ![currentUser] = @ \cup {nextTaskId}]
/\ nextTaskId' = nextTaskId + 1
/\ UNCHANGED currentUser
\* Update task status
UpdateTaskStatus(taskId, newStatus) ==
/\ currentUser # NULL
/\ taskId \in DOMAIN tasks
/\ taskId \in userTasks[currentUser]
/\ newStatus \in TaskStates
/\ tasks' = [tasks EXCEPT ![taskId].status = newStatus]
/\ UNCHANGED <<userTasks, nextTaskId, currentUser>>
\* Delete a task
DeleteTask(taskId) ==
/\ currentUser # NULL
/\ taskId \in DOMAIN tasks
/\ taskId \in userTasks[currentUser]
/\ tasks' = [id \in (DOMAIN tasks \ {taskId}) |-> tasks[id]]
/\ userTasks' = [userTasks EXCEPT ![currentUser] = @ \ {taskId}]
/\ UNCHANGED <<nextTaskId, currentUser>>
Step 3: Safety and Liveness Properties
TLA+ shines when defining system properties (TaskManagement.tla):
\* Safety properties
NoOrphanTasks ==
\A taskId \in DOMAIN tasks :
\E user \in Users : taskId \in GetUserTasks(user)
TaskOwnership ==
\A taskId \in DOMAIN tasks :
tasks[taskId].assignee \in Users /\
taskId \in GetUserTasks(tasks[taskId].assignee)
ValidTaskIds ==
\A taskId \in DOMAIN tasks :
/\ taskId < nextTaskId
/\ taskId >= 1
NoDuplicateTaskIds ==
\A t1, t2 \in DOMAIN tasks :
t1 = t2 \/ tasks[t1].id # tasks[t2].id
ValidStateTransitionsInvariant ==
\A taskId \in DOMAIN tasks :
tasks[taskId].status \in TaskStates
ConsistentTimestamps ==
\A taskId \in DOMAIN tasks :
/\ tasks[taskId].createdAt <= tasks[taskId].updatedAt
/\ tasks[taskId].updatedAt <= clock
NoCyclicDependencies ==
LET
\* Transitive closure of dependencies
RECURSIVE TransitiveDeps(_)
TransitiveDeps(taskId) ==
IF ~TaskExists(taskId) THEN {}
ELSE LET directDeps == tasks[taskId].dependencies IN
directDeps \cup
UNION {TransitiveDeps(dep) : dep \in directDeps}
IN
\A taskId \in DOMAIN tasks :
taskId \notin TransitiveDeps(taskId)
AuthenticationRequired ==
\* All task operations require authentication
\A taskId \in DOMAIN tasks :
tasks[taskId].createdBy \in Users
SafetyInvariant ==
/\ NoOrphanTasks
/\ TaskOwnership
/\ ValidTaskIds
/\ NoDuplicateTaskIds
/\ ValidStateTransitionsInvariant
/\ ConsistentTimestamps
/\ NoCyclicDependencies
/\ AuthenticationRequired
\* Next state relation
Next ==
\/ AdvanceTime
\/ \E user \in Users : Authenticate(user)
\/ Logout
\/ \E t \in Titles, d \in Descriptions, p \in Priorities,
u \in Users, dd \in 0..MaxTime \cup {"NULL"},
tags \in SUBSET {"bug", "feature", "enhancement", "documentation"},
deps \in SUBSET DOMAIN tasks :
CreateTask(t, d, p, u, dd, tags, deps)
\/ \E taskId \in DOMAIN tasks, newStatus \in TaskStates :
UpdateTaskStatus(taskId, newStatus)
\/ \E taskId \in DOMAIN tasks, newPriority \in Priorities :
UpdateTaskPriority(taskId, newPriority)
\/ \E taskId \in DOMAIN tasks, newAssignee \in Users :
ReassignTask(taskId, newAssignee)
\/ \E taskId \in DOMAIN tasks, t \in Titles,
d \in Descriptions, dd \in 0..MaxTime \cup {"NULL"} :
UpdateTaskDetails(taskId, t, d, dd)
\/ \E taskId \in DOMAIN tasks : DeleteTask(taskId)
\/ CheckDependencies
\/ \E taskIds \in SUBSET DOMAIN tasks, newStatus \in TaskStates :
taskIds # {} /\ BulkUpdateStatus(taskIds, newStatus)
\* Properties to check
THEOREM TypeCorrectness == Spec => []TypeInvariant
THEOREM SafetyHolds == Spec => []SafetyInvariant
THEOREM LivenessHolds == Spec => (EventualCompletion /\ FairProgress)
THEOREM NoDeadlock == Spec => []<>Next
THEOREM Termination == Spec => <>(\A taskId \in DOMAIN tasks :
tasks[taskId].status \in {"completed", "cancelled"})
=============================================================================
Step 4: Model Checking and Trace Generation
Now we can run TLA+ model checking to verify our specification (TaskManagement.cfg):
\* Model configuration for TaskManagementImproved module
SPECIFICATION Spec
\* Constants definition
CONSTANTS
Users = {alice, bob, charlie}
MaxTasks = 5
MaxTime = 20
Titles = {task1, task2, task3, task4, task5}
Descriptions = {desc1, desc2, desc3}
\* Model values for special constants
CONSTANT
NULL = NULL
EMPTY_STRING = EMPTY_STRING
\* Initial state constraint
CONSTRAINT
/\ nextTaskId <= MaxTasks + 1
/\ clock <= MaxTime
/\ Cardinality(DOMAIN tasks) <= MaxTasks
\* State space reduction (optional, for faster checking)
ACTION_CONSTRAINT
\* Limit number of active sessions
/\ Cardinality({u \in Users : sessions[u] = TRUE}) <= 2
\* Prevent creating too many tasks at once
/\ nextTaskId <= MaxTasks
\* Invariants to check
INVARIANT TypeInvariant
INVARIANT SafetyInvariant
INVARIANT NoOrphanTasks
INVARIANT TaskOwnership
INVARIANT ValidTaskIds
INVARIANT NoDuplicateTaskIds
INVARIANT ValidStateTransitionsInvariant
INVARIANT ConsistentTimestamps
INVARIANT NoCyclicDependencies
INVARIANT AuthenticationRequired
\* Properties to check
PROPERTY EventualCompletion
PROPERTY FairProgress
PROPERTY EventualUnblocking
PROPERTY EventualAuthentication
PROPERTY NoStarvation
\* Check for deadlocks
CHECK_DEADLOCK TRUE
\* View for debugging (optional)
VIEW <<nextTaskId, Cardinality(DOMAIN tasks), clock>>
\* Alias for better state visualization
ALIAS TaskSummary == [
totalTasks |-> Cardinality(DOMAIN tasks),
pendingTasks |-> Cardinality({t \in DOMAIN tasks : tasks[t].status = "pending"}),
inProgressTasks |-> Cardinality({t \in DOMAIN tasks : tasks[t].status = "in_progress"}),
completedTasks |-> Cardinality({t \in DOMAIN tasks : tasks[t].status = "completed"}),
blockedTasks |-> Cardinality({t \in DOMAIN tasks : tasks[t].status = "blocked"}),
activeSessions |-> {u \in Users : sessions[u] = TRUE},
currentUser |-> currentUser,
time |-> clock
]# Run the model checker java -jar tla2tools.jar -config TaskManagement.cfg TaskManagement.tla
TLA+ will explore all possible execution paths and either:
- Verify that all properties hold
- Generate counterexamples showing property violations
Step 5: Claude Implementation with TLA+ Context
Now comes the magic. Instead of vague natural language, we give Claude the precise TLA+ specification:
My prompt to Claude:
Using this TLA+ specification for a task management system, implement a Go REST API that satisfies all the specified behaviors and properties. [Include the full TLA+ specification here] Requirements: 1. Implement all actions defined in the TLA+ spec 2. Ensure all safety properties hold at runtime 3. Use Go's type system to enforce the data structures 4. Include comprehensive tests that cover the execution traces 5. Add proper error handling for violation of preconditions Focus on behavioral correctness over performance optimization.
Claude generates something like (task.go):
// Package domain contains the core business entities matching the TLA+ specification
package domain
import (
"fmt"
"time"
)
// TaskID represents a unique task identifier (maps to TLA+ task ID)
type TaskID int
// UserID represents a user identifier (maps to TLA+ Users set)
type UserID string
// TaskStatus represents the state of a task (maps to TLA+ TaskStates)
type TaskStatus string
const (
StatusPending TaskStatus = "pending"
StatusInProgress TaskStatus = "in_progress"
StatusCompleted TaskStatus = "completed"
StatusCancelled TaskStatus = "cancelled"
StatusBlocked TaskStatus = "blocked"
)
// Priority represents task priority levels (maps to TLA+ Priorities)
type Priority string
const (
PriorityLow Priority = "low"
PriorityMedium Priority = "medium"
PriorityHigh Priority = "high"
PriorityCritical Priority = "critical"
)
// Tag represents task categories (maps to TLA+ tags subset)
type Tag string
const (
TagBug Tag = "bug"
TagFeature Tag = "feature"
TagEnhancement Tag = "enhancement"
TagDocumentation Tag = "documentation"
)
// Task represents a task entity (maps to TLA+ task record)
type Task struct {
ID TaskID `json:"id"`
Title string `json:"title"`
Description string `json:"description"`
Status TaskStatus `json:"status"`
Priority Priority `json:"priority"`
Assignee UserID `json:"assignee"`
CreatedBy UserID `json:"created_by"`
CreatedAt time.Time `json:"created_at"`
UpdatedAt time.Time `json:"updated_at"`
DueDate *time.Time `json:"due_date,omitempty"`
Tags []Tag `json:"tags"`
Dependencies map[TaskID]bool `json:"dependencies"`
}
// ValidTransition represents a valid state transition (maps to TLA+ ValidTransitions)
type ValidTransition struct {
From TaskStatus
To TaskStatus
}
// ValidTransitions defines all allowed state transitions
var ValidTransitions = map[ValidTransition]bool{
{StatusPending, StatusInProgress}: true,
{StatusPending, StatusCancelled}: true,
{StatusPending, StatusBlocked}: true,
{StatusInProgress, StatusCompleted}: true,
{StatusInProgress, StatusCancelled}: true,
{StatusInProgress, StatusBlocked}: true,
{StatusInProgress, StatusPending}: true, // Allow reverting
{StatusBlocked, StatusPending}: true,
{StatusBlocked, StatusInProgress}: true,
{StatusBlocked, StatusCancelled}: true,
}
// IsValidTransition checks if a state transition is valid (maps to TLA+ IsValidTransition)
func IsValidTransition(from, to TaskStatus) bool {
return ValidTransitions[ValidTransition{From: from, To: to}]
}
// CanDelete checks if a task can be deleted (only completed or cancelled)
func (t *Task) CanDelete() bool {
return t.Status == StatusCompleted || t.Status == StatusCancelled
}
// IsBlocked checks if task should be blocked based on dependencies
func (t *Task) IsBlocked(allTasks map[TaskID]*Task) bool {
if len(t.Dependencies) == 0 {
return false
}
for depID := range t.Dependencies {
if dep, exists := allTasks[depID]; exists {
if dep.Status != StatusCompleted {
return true
}
}
}
return false
}
// ShouldUnblock checks if a blocked task can be unblocked
func (t *Task) ShouldUnblock(allTasks map[TaskID]*Task) bool {
if t.Status != StatusBlocked {
return false
}
for depID := range t.Dependencies {
if dep, exists := allTasks[depID]; exists {
if dep.Status != StatusCompleted {
return false
}
}
}
return true
}
// Validate performs domain validation on the task
func (t *Task) Validate() error {
if t.Title == "" {
return fmt.Errorf("task title cannot be empty")
}
if t.Description == "" {
return fmt.Errorf("task description cannot be empty")
}
if !isValidStatus(t.Status) {
return fmt.Errorf("invalid task status: %s", t.Status)
}
if !isValidPriority(t.Priority) {
return fmt.Errorf("invalid task priority: %s", t.Priority)
}
if t.Assignee == "" {
return fmt.Errorf("task must have an assignee")
}
if t.CreatedBy == "" {
return fmt.Errorf("task must have a creator")
}
if t.CreatedAt.After(t.UpdatedAt) {
return fmt.Errorf("created time cannot be after updated time")
}
for _, tag := range t.Tags {
if !isValidTag(tag) {
return fmt.Errorf("invalid tag: %s", tag)
}
}
return nil
}
func isValidStatus(status TaskStatus) bool {
switch status {
case StatusPending, StatusInProgress, StatusCompleted, StatusCancelled, StatusBlocked:
return true
default:
return false
}
}
func isValidPriority(priority Priority) bool {
switch priority {
case PriorityLow, PriorityMedium, PriorityHigh, PriorityCritical:
return true
default:
return false
}
}
func isValidTag(tag Tag) bool {
switch tag {
case TagBug, TagFeature, TagEnhancement, TagDocumentation:
return true
default:
return false
}
}
// Package usecase implements the TLA+ actions as use cases
package usecase
import (
"crypto/rand"
"encoding/hex"
"fmt"
"time"
"github.com/bhatti/sample-task-management/internal/domain"
"github.com/bhatti/sample-task-management/internal/repository"
)
// TaskUseCase implements task-related TLA+ actions
type TaskUseCase struct {
uow repository.UnitOfWork
invariantChecker InvariantChecker
}
// InvariantChecker interface for runtime invariant validation
type InvariantChecker interface {
CheckAllInvariants(state *domain.SystemState) error
CheckTaskInvariants(task *domain.Task, state *domain.SystemState) error
CheckTransitionInvariant(from, to domain.TaskStatus) error
}
// NewTaskUseCase creates a new task use case
func NewTaskUseCase(uow repository.UnitOfWork, checker InvariantChecker) *TaskUseCase {
return &TaskUseCase{
uow: uow,
invariantChecker: checker,
}
}
// Authenticate implements TLA+ Authenticate action
func (uc *TaskUseCase) Authenticate(userID domain.UserID) (*domain.Session, error) {
// Preconditions from TLA+:
// - user \in Users
// - ~sessions[user]
user, err := uc.uow.Users().GetUser(userID)
if err != nil {
return nil, fmt.Errorf("user not found: %w", err)
}
// Check if user already has an active session
existingSession, _ := uc.uow.Sessions().GetSessionByUser(userID)
if existingSession != nil && existingSession.IsValid() {
return nil, fmt.Errorf("user %s already has an active session", userID)
}
// Create new session
token := generateToken()
session := &domain.Session{
UserID: user.ID,
Token: token,
Active: true,
CreatedAt: time.Now(),
ExpiresAt: time.Now().Add(24 * time.Hour),
}
// Update state
if err := uc.uow.Sessions().CreateSession(session); err != nil {
return nil, fmt.Errorf("failed to create session: %w", err)
}
if err := uc.uow.SystemState().SetCurrentUser(&userID); err != nil {
return nil, fmt.Errorf("failed to set current user: %w", err)
}
// Check invariants
state, _ := uc.uow.SystemState().GetSystemState()
if err := uc.invariantChecker.CheckAllInvariants(state); err != nil {
uc.uow.Rollback()
return nil, fmt.Errorf("invariant violation: %w", err)
}
return session, nil
}
// CreateTask implements TLA+ CreateTask action
func (uc *TaskUseCase) CreateTask(
title, description string,
priority domain.Priority,
assignee domain.UserID,
dueDate *time.Time,
tags []domain.Tag,
dependencies []domain.TaskID,
) (*domain.Task, error) {
// Preconditions from TLA+:
// - currentUser # NULL
// - currentUser \in Users
// - nextTaskId <= MaxTasks
// - deps \subseteq DOMAIN tasks
// - \A dep \in deps : tasks[dep].status # "cancelled"
currentUser, err := uc.uow.SystemState().GetCurrentUser()
if err != nil || currentUser == nil {
return nil, fmt.Errorf("authentication required")
}
// Check max tasks limit
nextID, err := uc.uow.SystemState().GetNextTaskID()
if err != nil {
return nil, fmt.Errorf("failed to get next task ID: %w", err)
}
if nextID > domain.MaxTasks {
return nil, fmt.Errorf("maximum number of tasks (%d) reached", domain.MaxTasks)
}
// Validate dependencies
allTasks, err := uc.uow.Tasks().GetAllTasks()
if err != nil {
return nil, fmt.Errorf("failed to get tasks: %w", err)
}
depMap := make(map[domain.TaskID]bool)
for _, depID := range dependencies {
depTask, exists := allTasks[depID]
if !exists {
return nil, fmt.Errorf("dependency task %d does not exist", depID)
}
if depTask.Status == domain.StatusCancelled {
return nil, fmt.Errorf("cannot depend on cancelled task %d", depID)
}
depMap[depID] = true
}
// Check for cyclic dependencies
if err := uc.checkCyclicDependencies(nextID, depMap, allTasks); err != nil {
return nil, err
}
// Determine initial status based on dependencies
status := domain.StatusPending
if len(dependencies) > 0 {
// Check if all dependencies are completed
allCompleted := true
for depID := range depMap {
if allTasks[depID].Status != domain.StatusCompleted {
allCompleted = false
break
}
}
if !allCompleted {
status = domain.StatusBlocked
}
}
// Create task
task := &domain.Task{
ID: nextID,
Title: title,
Description: description,
Status: status,
Priority: priority,
Assignee: assignee,
CreatedBy: *currentUser,
CreatedAt: time.Now(),
UpdatedAt: time.Now(),
DueDate: dueDate,
Tags: tags,
Dependencies: depMap,
}
// Validate task
if err := task.Validate(); err != nil {
return nil, fmt.Errorf("task validation failed: %w", err)
}
// Save task
if err := uc.uow.Tasks().CreateTask(task); err != nil {
return nil, fmt.Errorf("failed to create task: %w", err)
}
// Increment next task ID
if _, err := uc.uow.SystemState().IncrementNextTaskID(); err != nil {
return nil, fmt.Errorf("failed to increment task ID: %w", err)
}
// Check invariants
state, _ := uc.uow.SystemState().GetSystemState()
if err := uc.invariantChecker.CheckAllInvariants(state); err != nil {
uc.uow.Rollback()
return nil, fmt.Errorf("invariant violation after task creation: %w", err)
}
return task, nil
}
// UpdateTaskStatus implements TLA+ UpdateTaskStatus action
func (uc *TaskUseCase) UpdateTaskStatus(taskID domain.TaskID, newStatus domain.TaskStatus) error {
// Preconditions from TLA+:
// - currentUser # NULL
// - TaskExists(taskId)
// - taskId \in GetUserTasks(currentUser)
// - IsValidTransition(tasks[taskId].status, newStatus)
// - newStatus = "in_progress" => all dependencies completed
currentUser, err := uc.uow.SystemState().GetCurrentUser()
if err != nil || currentUser == nil {
return fmt.Errorf("authentication required")
}
task, err := uc.uow.Tasks().GetTask(taskID)
if err != nil {
return fmt.Errorf("task not found: %w", err)
}
// Check user owns the task
userTasks, err := uc.uow.SystemState().GetUserTasks(*currentUser)
if err != nil {
return fmt.Errorf("failed to get user tasks: %w", err)
}
hasTask := false
for _, id := range userTasks {
if id == taskID {
hasTask = true
break
}
}
if !hasTask {
return fmt.Errorf("user does not have access to task %d", taskID)
}
// Check valid transition
if !domain.IsValidTransition(task.Status, newStatus) {
return fmt.Errorf("invalid transition from %s to %s", task.Status, newStatus)
}
// Check dependencies if moving to in_progress
if newStatus == domain.StatusInProgress {
allTasks, _ := uc.uow.Tasks().GetAllTasks()
for depID := range task.Dependencies {
if depTask, exists := allTasks[depID]; exists {
if depTask.Status != domain.StatusCompleted {
return fmt.Errorf("cannot start task: dependency %d is not completed", depID)
}
}
}
}
// Update status
task.Status = newStatus
task.UpdatedAt = time.Now()
if err := uc.uow.Tasks().UpdateTask(task); err != nil {
return fmt.Errorf("failed to update task: %w", err)
}
// Check invariants
state, _ := uc.uow.SystemState().GetSystemState()
if err := uc.invariantChecker.CheckAllInvariants(state); err != nil {
uc.uow.Rollback()
return fmt.Errorf("invariant violation: %w", err)
}
return nil
}
...Step 6: TLA+ Generated Tests
The real power comes when we use TLA+ execution traces to generate comprehensive tests:
My prompt to Claude:
Generate Go tests that verify the implementation satisfies the TLA+ specification. Create test cases that: 1. Test all TLA+ actions with valid preconditions 2. Test safety property violations 3. Test edge cases from the TLA+ model boundary conditions 4. Use property-based testing where appropriate Include tests that would catch the execution traces TLA+ model checker explores.
Claude generates (invariant_test.go):
// Package property implements property-based tests for TLA+ invariants
package property
import (
"math/rand"
"testing"
"time"
"github.com/bhatti/sample-task-management/internal/domain"
"github.com/bhatti/sample-task-management/internal/infrastructure/memory"
"github.com/bhatti/sample-task-management/internal/usecase"
"github.com/bhatti/sample-task-management/pkg/invariants"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
)
// TestInvariantsHoldAfterOperations verifies invariants hold after each operation
func TestInvariantsHoldAfterOperations(t *testing.T) {
repo := memory.NewMemoryRepository()
uow := memory.NewMemoryUnitOfWork(repo)
checker := invariants.NewInvariantChecker()
uc := usecase.NewTaskUseCase(uow, checker)
// Setup initial users
users := []domain.UserID{"alice", "bob", "charlie"}
for _, userID := range users {
user := &domain.User{
ID: userID,
Name: string(userID),
Email: string(userID) + "@example.com",
JoinedAt: time.Now(),
}
require.NoError(t, repo.CreateUser(user))
}
// Property: Invariants hold after authentication
t.Run("InvariantsAfterAuthentication", func(t *testing.T) {
for _, userID := range users {
session, err := uc.Authenticate(userID)
assert.NoError(t, err)
assert.NotNil(t, session)
state, _ := repo.GetSystemState()
assert.NoError(t, checker.CheckAllInvariants(state))
// Cleanup
_ = uc.Logout(userID)
}
})
// Property: Invariants hold after task creation
t.Run("InvariantsAfterTaskCreation", func(t *testing.T) {
uc.Authenticate("alice")
for i := 0; i < 10; i++ {
task, err := uc.CreateTask(
"Task "+string(rune(i)),
"Description",
randomPriority(),
randomUser(users),
randomDueDate(),
randomTags(),
[]domain.TaskID{}, // No dependencies initially
)
assert.NoError(t, err)
assert.NotNil(t, task)
state, _ := repo.GetSystemState()
assert.NoError(t, checker.CheckAllInvariants(state))
}
})
// Property: Invariants hold after status transitions
t.Run("InvariantsAfterStatusTransitions", func(t *testing.T) {
uc.Authenticate("alice")
// Create a task
task, _ := uc.CreateTask(
"Test Task",
"Description",
domain.PriorityMedium,
"alice",
nil,
[]domain.Tag{domain.TagFeature},
[]domain.TaskID{},
)
// Valid transitions
validTransitions := []domain.TaskStatus{
domain.StatusInProgress,
domain.StatusCompleted,
}
for _, status := range validTransitions {
err := uc.UpdateTaskStatus(task.ID, status)
if err == nil {
state, _ := repo.GetSystemState()
assert.NoError(t, checker.CheckAllInvariants(state))
}
}
})
// Property: No cyclic dependencies can be created
t.Run("NoCyclicDependencies", func(t *testing.T) {
uc.Authenticate("alice")
// Create tasks with potential cycles
task1, _ := uc.CreateTask("Task1", "Desc", domain.PriorityLow, "alice", nil, nil, []domain.TaskID{})
task2, _ := uc.CreateTask("Task2", "Desc", domain.PriorityLow, "alice", nil, nil, []domain.TaskID{task1.ID})
task3, _ := uc.CreateTask("Task3", "Desc", domain.PriorityLow, "alice", nil, nil, []domain.TaskID{task2.ID})
// Attempting to create a cycle should fail
_, err := uc.CreateTask("Task4", "Desc", domain.PriorityLow, "alice", nil, nil,
[]domain.TaskID{task3.ID, task1.ID}) // This would create a cycle
assert.NoError(t, err)
// Even if it doesn't fail explicitly, invariants should catch it
state, _ := repo.GetSystemState()
assert.NoError(t, checker.CheckAllInvariants(state))
})
}
// TestTransitionInvariants tests state transition validity
func TestTransitionInvariants(t *testing.T) {
checker := invariants.NewInvariantChecker()
// Test all valid transitions
validTransitions := []struct {
from domain.TaskStatus
to domain.TaskStatus
}{
{domain.StatusPending, domain.StatusInProgress},
{domain.StatusPending, domain.StatusCancelled},
{domain.StatusInProgress, domain.StatusCompleted},
{domain.StatusInProgress, domain.StatusCancelled},
{domain.StatusBlocked, domain.StatusPending},
{domain.StatusBlocked, domain.StatusCancelled},
}
for _, trans := range validTransitions {
t.Run(string(trans.from)+"_to_"+string(trans.to), func(t *testing.T) {
err := checker.CheckTransitionInvariant(trans.from, trans.to)
assert.NoError(t, err)
})
}
// Test invalid transitions
invalidTransitions := []struct {
from domain.TaskStatus
to domain.TaskStatus
}{
{domain.StatusCompleted, domain.StatusPending},
{domain.StatusCompleted, domain.StatusInProgress},
{domain.StatusCancelled, domain.StatusInProgress},
{domain.StatusPending, domain.StatusCompleted}, // Must go through in_progress
}
for _, trans := range invalidTransitions {
t.Run("Invalid_"+string(trans.from)+"_to_"+string(trans.to), func(t *testing.T) {
err := checker.CheckTransitionInvariant(trans.from, trans.to)
assert.Error(t, err)
})
}
}
// TestPropertyTaskOwnership verifies task ownership invariants
func TestPropertyTaskOwnership(t *testing.T) {
repo := memory.NewMemoryRepository()
uow := memory.NewMemoryUnitOfWork(repo)
checker := invariants.NewInvariantChecker()
uc := usecase.NewTaskUseCase(uow, checker)
// Setup users
users := []domain.UserID{"alice", "bob"}
for _, userID := range users {
user := &domain.User{
ID: userID,
Name: string(userID),
Email: string(userID) + "@example.com",
JoinedAt: time.Now(),
}
repo.CreateUser(user)
}
// Property: Task reassignment maintains ownership invariants
t.Run("ReassignmentMaintainsOwnership", func(t *testing.T) {
uc.Authenticate("alice")
// Create task assigned to Alice
task, err := uc.CreateTask(
"Test Task",
"Description",
domain.PriorityHigh,
"alice",
nil,
[]domain.Tag{domain.TagBug},
[]domain.TaskID{},
)
require.NoError(t, err)
// Check initial ownership
state, _ := repo.GetSystemState()
assert.NoError(t, checker.CheckAllInvariants(state))
aliceTasks := state.GetUserTasks("alice")
assert.Contains(t, aliceTasks, task.ID)
// Reassign to Bob
err = uc.ReassignTask(task.ID, "bob")
require.NoError(t, err)
// Check ownership after reassignment
state, _ = repo.GetSystemState()
assert.NoError(t, checker.CheckAllInvariants(state))
aliceTasks = state.GetUserTasks("alice")
bobTasks := state.GetUserTasks("bob")
assert.NotContains(t, aliceTasks, task.ID)
assert.Contains(t, bobTasks, task.ID)
})
}
// TestPropertyConcurrentOperations tests invariants under concurrent operations
func TestPropertyConcurrentOperations(t *testing.T) {
repo := memory.NewMemoryRepository()
uow := memory.NewMemoryUnitOfWork(repo)
checker := invariants.NewInvariantChecker()
// Setup users
users := []domain.UserID{"user1", "user2", "user3"}
for _, userID := range users {
user := &domain.User{
ID: userID,
Name: string(userID),
Email: string(userID) + "@example.com",
JoinedAt: time.Now(),
}
repo.CreateUser(user)
}
// Run concurrent operations
done := make(chan bool, len(users))
for _, userID := range users {
go func(uid domain.UserID) {
uc := usecase.NewTaskUseCase(uow, checker)
// Authenticate
uc.Authenticate(uid)
// Create multiple tasks
for i := 0; i < 5; i++ {
uc.CreateTask(
"Task",
"Description",
randomPriority(),
uid,
nil,
randomTags(),
[]domain.TaskID{},
)
// Random delay
time.Sleep(time.Duration(rand.Intn(10)) * time.Millisecond)
}
done <- true
}(userID)
}
// Wait for all goroutines
for i := 0; i < len(users); i++ {
<-done
}
// Check invariants after concurrent operations
state, _ := repo.GetSystemState()
assert.NoError(t, checker.CheckAllInvariants(state))
}
// Helper functions
func randomPriority() domain.Priority {
priorities := []domain.Priority{
domain.PriorityLow,
domain.PriorityMedium,
domain.PriorityHigh,
domain.PriorityCritical,
}
return priorities[rand.Intn(len(priorities))]
}
func randomUser(users []domain.UserID) domain.UserID {
return users[rand.Intn(len(users))]
}
func randomDueDate() *time.Time {
if rand.Float32() < 0.5 {
return nil
}
due := time.Now().Add(time.Duration(rand.Intn(30)) * 24 * time.Hour)
return &due
}
func randomTags() []domain.Tag {
allTags := []domain.Tag{
domain.TagBug,
domain.TagFeature,
domain.TagEnhancement,
domain.TagDocumentation,
}
numTags := rand.Intn(len(allTags) + 1)
if numTags == 0 {
return nil
}
tags := make([]domain.Tag, 0, numTags)
used := make(map[domain.Tag]bool)
for len(tags) < numTags {
tag := allTags[rand.Intn(len(allTags))]
if !used[tag] {
tags = append(tags, tag)
used[tag] = true
}
}
return tags
}
Advanced TLA+ Patterns with Claude
Modeling Concurrent Operations
One of TLA+’s strengths is modeling concurrent systems. Let’s extend our specification to handle concurrent task updates:
\* Concurrent task updates with conflict resolution
ConcurrentUpdateTask(taskId, newStatus, version) ==
/\ currentUser # NULL
/\ taskId \in DOMAIN tasks
/\ taskId \in userTasks[currentUser]
/\ tasks[taskId].version = version \* Optimistic concurrency control
/\ tasks' = [tasks EXCEPT ![taskId] = [
@ EXCEPT
!.status = newStatus,
!.version = @ + 1,
!.lastModified = currentUser
]]
/\ UNCHANGED <<userTasks, nextTaskId, currentUser>>
Prompt to Claude:
Implement optimistic concurrency control for the task updates based on this TLA+ specification. Include version tracking and conflict detection.
Modeling Complex Business Rules
TLA+ excels at capturing complex business logic:
\* Business rule: High priority tasks cannot be cancelled directly
ValidStatusTransition(currentStatus, newStatus, priority) ==
\/ newStatus = currentStatus
\/ /\ currentStatus = "pending"
/\ newStatus \in {"in_progress", "cancelled"}
\/ /\ currentStatus = "in_progress"
/\ newStatus \in {"completed", "pending"}
\/ /\ currentStatus = "in_progress"
/\ newStatus = "cancelled"
/\ priority # "high" \* High priority tasks cannot be cancelled
Lessons Learned
After applying this TLA+ approach to several experimental projects, here are the key insights:
1. Start Small
Begin with core actions and properties. TLA+ specifications can grow complex quickly, so start with the essential behaviors:
\* Start with basic CRUD Init, CreateTask, UpdateTask, DeleteTask \* Add complexity incrementally Authentication, Authorization, Concurrency, Business Rules
Avoid Initially: Complex distributed systems, performance-critical algorithms
Graduate To: Multi-service interactions, complex business logic
2. Properties Drive Design
Writing TLA+ properties often reveals design flaws before implementation:
\* This property might fail, revealing a design issue
ConsistencyProperty ==
\A user \in Users:
\A taskId \in userTasks[user]:
/\ taskId \in DOMAIN tasks
/\ tasks[taskId].assignee = user
/\ tasks[taskId].status # "deleted" \* Soft delete consideration
3. Model Checking Finds Edge Cases
TLA+ model checking explores execution paths you’d never think to test:
# TLA+ finds this counterexample: # Step 1: User1 creates Task1 # Step 2: User1 deletes Task1 # Step 3: User2 creates Task2 (gets same ID due to reuse) # Step 4: User1 tries to update Task1 -> Security violation!
This led to using UUIDs instead of incrementing integers for task IDs.
4. Generated Tests Are Comprehensive
TLA+ execution traces become your regression test suite. When Claude implements based on TLA+ specs, you get:
- Complete coverage – All specification paths tested
- Edge case detection – Boundary conditions from model checking
- Behavioral contracts – Tests verify actual system properties
Documentation Generation
Prompt to Claude:
Generate API documentation from this TLA+ specification that includes: 1. Endpoint descriptions derived from TLA+ actions 2. Request/response schemas from TLA+ data structures 3. Error conditions from TLA+ preconditions 4. Behavioral guarantees from TLA+ properties
Code Review Guidelines
With TLA+ specifications, code reviews become more focused:
- Does implementation satisfy the TLA+ spec?
- Are all preconditions checked?
- Do safety properties hold?
- Are error conditions handled as specified?
Comparing Specification Approaches
| Approach | Precision | AI Effectiveness | Maintenance | Learning Curve | Tool Complexity | Code Generation |
|---|---|---|---|---|---|---|
| Vibe Coding | Low | Inconsistent | High | Low | Low | N/A |
| UML/MDD | Medium | Poor | Very High | High | Very High | Brittle |
| BDD/Gherkin | Medium | Better | Medium | Medium | Low | Limited |
| TLA+ Specs | High | Excellent | Low | High | Low | Reliable |
Tools and Resources
Essential TLA+ Resources
- Learn TLA+: https://learntla.com – Interactive tutorial
- TLA+ Video Course: Leslie Lamport’s official course
- Practical TLA+: Hillel Wayne’s book – focus on software systems
- TLA+ Examples: https://github.com/tlaplus/Examples
Common Mistakes
1. Avoid These Mistakes
? Writing TLA+ like code
\* Wrong - this looks like pseudocode
CreateTask ==
if currentUser != null then
task = new Task()? Writing TLA+ as mathematical relations
\* Right - mathematical specification
CreateTask ==
/\ currentUser # NULL
/\ tasks' = tasks @@ (nextTaskId :> newTask)? Asking Claude to “fix the TLA+ to match the code”
The spec is the truth – fix the code to match the spec
? Asking Claude to “implement this TLA+ specification correctly”
? Specification scope creep: Starting with entire system architecture ? Incremental approach: Begin with one core workflow, expand gradually
2. Claude Integration Pitfalls
? “Fix the spec to match my code”: Treating specifications as documentation ? “Fix the code to match the spec”: Specifications are the source of truth
3. The Context Overload Trap
Problem: Dumping too much information at once
Solution: Break complex features into smaller, focused requests
4. The “Fix My Test” Antipattern
Problem: When tests fail, asking Claude to modify the test instead of the code
Solution: Always fix the implementation, not the test (unless the test is genuinely wrong)
5. The Blind Trust Mistake
Problem: Accepting generated code without understanding it
Solution: Always review and understand the code before committing
Proven Patterns
1. Save effective prompts:
# ~/.claude/tla-prompts/implementation.md Implement [language] code that satisfies this TLA+ specification: [SPEC] Requirements: - All TLA+ actions become functions/methods - All preconditions become runtime checks - All data structures match TLA+ types - Include comprehensive tests covering specification traces
Create specification templates:
--------------------------- MODULE [ModuleName] --------------------------- EXTENDS Integers, Sequences, FiniteSets CONSTANTS [Constants] VARIABLES [StateVariables] [TypeDefinitions] Init == [InitialConditions] [Actions] Next == [ActionDisjunction] Spec == Init /\ [][Next]_[StateVariables] [SafetyProperties] [LivenessProperties] =============================================================================
2. The “Explain First” Pattern
Before asking Claude to implement something complex, I ask for an explanation:
Explain how you would implement real-time task updates using WebSockets. What are the trade-offs between Socket.io and native WebSockets? What state management challenges should I consider?
3. The “Progressive Enhancement” Pattern
Start simple, then add complexity:
1. First: "Create a basic task model with CRUD operations" 2. Then: "Add validation and error handling" 3. Then: "Add authentication and authorization" 4. Finally: "Add real-time updates and notifications"
4. The “Code Review” Pattern
After implementation, I ask Claude to review its own code:
Review the task API implementation for: - Security vulnerabilities - Performance issues - Code style consistency - Missing error cases Be critical and suggest improvements.
What’s Next
As I’ve developed this TLA+/Claude workflow, I’ve realized we’re approaching something profound: specifications as the primary artifact. Instead of writing code and hoping it’s correct, we’re defining correct behavior formally and letting AI generate the implementation. This inverts the traditional relationship between specification and code.
Implications for Software Engineering
- Design-first development becomes natural
- Bug prevention replaces bug fixing
- Refactoring becomes re-implementation from stable specs
- Documentation is always up-to-date (it’s the spec)
I’m currently experimenting with:
- TLA+ to test case generation – Automated comprehensive testing
- Multi-language implementations – Same spec, different languages
- Specification composition – Building larger systems from verified components
- Quint specifications – A modern executable specification language with simpler syntax than TLA+
Conclusion: The End of Vibe Coding
After using TLA+ with Claude, I can’t go back to vibe coding. The precision, reliability, and confidence that comes from executable specifications has transformed how I build software. The complete working example—TLA+ specs, Go implementation, comprehensive tests, and CI/CD pipeline—is available at github.com/bhatti/sample-task-management.
Yes, there’s a learning curve. Yes, writing TLA+ specifications takes time upfront. But the payoff—in terms of correctness, maintainability, and development speed—is extraordinary. Claude becomes not just a code generator, but a reliable engineering partner that can reason about complex systems precisely because we’ve given it precise specifications to work from. We’re moving from “code and hope” to “specify and know”—and that changes everything.