Beyond Vibe Coding: Using TLA+ and Executable Specifications with Claude

August 25, 2025

Beyond Vibe Coding: Using TLA+ and Executable Specifications with Claude

Filed under: Computing,Uncategorized — admin @ 9:45 pm

TL;DR: The Problem and Solution

Problem: AI-assisted coding fails when modifying existing systems because we give AI vague specifications.

Solution: Use TLA+ formal specifications as precise contracts that Claude can implement reliably.

Result: Transform Claude from a code generator into a reliable engineering partner that reasons about complex systems.

After months of using Claude for development, I discovered most AI-assisted coding fails not because the AI isn’t smart enough, but because we’re asking it to work from vague specifications. This post shows you how to move beyond “vibe coding” using executable specifications that turn Claude into a reliable engineering partner.

Here’s what changes when you use TLA+ with Claude:

Before (Vibe Coding):

“Create a task management API”
Claude guesses at requirements
Inconsistent behavior across edge cases
Bugs in corner cases

After (TLA+ Specifications):

Precise mathematical specification
Claude implements exactly what you specified
All edge cases defined upfront
Properties verified before deployment

The Vibe Coding Problem

AI assistants like Claude are primarily trained on greenfield development patterns. They excel at:

Writing new functions from scratch
Implementing well-known algorithms
Creating boilerplate code

But they struggle with:

Understanding implicit behavioral contracts in existing code
Maintaining invariants across system modifications
Reasoning about state transitions and edge cases
Preserving non-functional requirements (performance, security, etc.)

The solution isn’t better prompts – it’s better specifications.

Enter Executable Specifications

An executable specification is a formal description of system behavior that can be:

Verified – Checked for logical consistency
Validated – Tested against real-world scenarios
Executed – Run to generate test cases or even implementations

I’ve tried many approaches to precise specifications over the years:

UML and Model Driven Development (2000s-2010s): I used tools like Rational Rose and Visual Paradigm in early 2000s that promised complete code generation from UML models. The reality was different:

Visual complexity: UML diagrams became unwieldy for anything non-trivial
Tool lock-in: Proprietary formats and expensive tooling
Impedance mismatch: The gap between UML models and real code was huge
Maintenance nightmare: Keeping models and code synchronized was nearly impossible
Limited expressiveness: UML couldn’t capture complex behavioral contracts

BDD and Gherkin (mid-2000s): I used BDD and Gherkin in mid 2000s, which were better than UML for behavioral specifications, but still limited:

Structured natural language: Readable but not truly executable
No logical reasoning: Couldn’t catch design contradictions
Testing focused: Good for acceptance criteria, poor for system design

TLA+ (present): Takes executable specifications to their logical conclusion:

Mathematical precision: Eliminates ambiguity completely
Model checking: Explores all possible execution paths
Tool independence: Plain text specifications, open source tools
Behavioral focus: Designed specifically for concurrent and distributed systems

Why TLA+ with Claude?

The magic happens when you combine TLA+’s precision with Claude’s implementation capabilities:

TLA+ eliminates ambiguity – There’s only one way to interpret a formal specification
Claude can read TLA+ – It understands the formal syntax and can translate it to code
Verification catches design flaws – TLA+ model checking finds edge cases you’d miss
Generated traces become tests – TLA+ execution paths become your test suite

Setting Up Your Claude and TLA+ Environment

Installing Claude Desktop

First, let’s get Claude running on your machine:

# Install via Homebrew (macOS)
brew install --cask claude

# Or download directly from Anthropic
# https://claude.ai/download

Set up project-specific contexts in ~/.claude/
Create TLA+ syntax rules for better code generation
Configure memory settings for specification patterns

Configuring Your Workspace

Once installed, I recommend creating a dedicated workspace structure. Here’s what works for me:

# Create a Claude workspace directory
mkdir -p ~/claude-workspace/{projects,templates,context}

# Add a context file for your coding standards
cat > ~/claude-workspace/context/coding-standards.md << 'EOF'
# My Coding Standards

- Use descriptive variable names
- Functions should do one thing well
- Write tests for all new features
- Handle errors explicitly
- Document complex logic
EOF

Installing TLA+ Tools

Choose based on your workflow

GUI users: TLA+ Toolbox for visual model checking
CLI users: tla2tools.jar for CI integration
Both: VS Code extension for syntax highlighting

# Download TLA+ Tools from https://github.com/tlaplus/tlaplus/releases
# Or use Homebrew on macOS
brew install --cask tla-plus-toolbox

# For command-line usage (recommended for CI)
wget https://github.com/tlaplus/tlaplus/releases/download/v1.8.0/tla2tools.jar

VS Code Extension

Install the TLA+ extension for syntax highlighting and basic validation:

code --install-extension alygin.vscode-tlaplus

Your First TLA+ Specification

Let’s start with a simple example to understand the syntax:

--------------------------- MODULE SimpleCounter ---------------------------
VARIABLE counter

Init == counter = 0

Increment == counter' = counter + 1

Decrement == counter' = counter - 1

Next == Increment \/ Decrement

Spec == Init /\ [][Next]_counter

TypeInvariant == counter \in Int

=============================================================================

This specification defines:

State: A counter variable
Initial condition: Counter starts at 0
Actions: Increment or decrement operations
Next state relation: Either action can occur
Invariant: Counter is always an integer

Real-World Example: Task Management API

Now let’s build something real. We’ll create a task management API using TLA+ specifications that Claude can implement in Go.

Step 1: Define the System State

First, we model what our system looks like (TaskManagement.tla):

--------------------------- MODULE TaskManagement ---------------------------
EXTENDS Integers, Sequences, FiniteSets, TLC

CONSTANTS
    Users,          \* Set of users
    MaxTasks,       \* Maximum number of tasks
    MaxTime,        \* Maximum time value for simulation
    Titles,         \* Set of possible task titles
    Descriptions    \* Set of possible task descriptions

VARIABLES
    tasks,          \* Function from task ID to task record
    userTasks,      \* Function from user ID to set of task IDs
    nextTaskId,     \* Counter for generating unique task IDs
    currentUser,    \* Currently authenticated user
    clock,          \* Global clock for timestamps
    sessions        \* Active user sessions

\* Task states enumeration with valid transitions
TaskStates == {"pending", "in_progress", "completed", "cancelled", "blocked"}

\* Priority levels
Priorities == {"low", "medium", "high", "critical"}

\* Valid state transitions
ValidTransitions == {
    <<"pending", "in_progress">>,
    <<"pending", "cancelled">>,
    <<"pending", "blocked">>,
    <<"in_progress", "completed">>,
    <<"in_progress", "cancelled">>,
    <<"in_progress", "blocked">>,
    <<"in_progress", "pending">>,      \* Allow reverting to pending
    <<"blocked", "pending">>,
    <<"blocked", "in_progress">>,
    <<"blocked", "cancelled">>
}


TaskRecord == [
    id: Nat,
    title: STRING,
    description: STRING,
    status: TaskStates,
    priority: {"low", "medium", "high"},
    assignee: Users,
    createdAt: Nat,
    dueDate: Nat \cup {NULL}
]

\* Type invariants
TypeInvariant == 
    /\ tasks \in [Nat -> TaskRecord]
    /\ userTasks \in [Users -> SUBSET Nat]
    /\ nextTaskId \in Nat
    /\ currentUser \in Users \cup {NULL}

Step 2: Define System Actions

Now we specify what operations are possible (TaskManagement.tla):

\* System initialization
Init ==
    /\ tasks = [i \in {} |-> CHOOSE x : FALSE]  \* Empty function
    /\ userTasks = [u \in Users |-> {}]
    /\ nextTaskId = 1
    /\ currentUser = "NULL"
    /\ clock = 0
    /\ sessions = [u \in Users |-> FALSE]

\* User authentication
Authenticate(user) ==
    /\ user \in Users
    /\ ~sessions[user]  \* User not already logged in
    /\ currentUser' = user
    /\ sessions' = [sessions EXCEPT ![user] = TRUE]
    /\ UNCHANGED <<tasks, userTasks, nextTaskId, clock>>

\* Create a new task
CreateTask(title, description, priority, dueDate) ==
    /\ currentUser # NULL
    /\ nextTaskId <= MaxTasks
    /\ LET newTask == [
           id |-> nextTaskId,
           title |-> title,
           description |-> description,
           status |-> "pending",
           priority |-> priority,
           assignee |-> currentUser,
           createdAt |-> nextTaskId, \* Simplified timestamp
           dueDate |-> dueDate
       ] IN
       /\ tasks' = tasks @@ (nextTaskId :> newTask)
       /\ userTasks' = [userTasks EXCEPT ![currentUser] = @ \cup {nextTaskId}]
       /\ nextTaskId' = nextTaskId + 1
       /\ UNCHANGED currentUser

\* Update task status
UpdateTaskStatus(taskId, newStatus) ==
    /\ currentUser # NULL
    /\ taskId \in DOMAIN tasks
    /\ taskId \in userTasks[currentUser]
    /\ newStatus \in TaskStates
    /\ tasks' = [tasks EXCEPT ![taskId].status = newStatus]
    /\ UNCHANGED <<userTasks, nextTaskId, currentUser>>

\* Delete a task
DeleteTask(taskId) ==
    /\ currentUser # NULL
    /\ taskId \in DOMAIN tasks
    /\ taskId \in userTasks[currentUser]
    /\ tasks' = [id \in (DOMAIN tasks \ {taskId}) |-> tasks[id]]
    /\ userTasks' = [userTasks EXCEPT ![currentUser] = @ \ {taskId}]
    /\ UNCHANGED <<nextTaskId, currentUser>>

Step 3: Safety and Liveness Properties

TLA+ shines when defining system properties (TaskManagement.tla):

\* Safety properties
NoOrphanTasks ==
    \A taskId \in DOMAIN tasks :
        \E user \in Users : taskId \in GetUserTasks(user)

TaskOwnership ==
    \A taskId \in DOMAIN tasks :
        tasks[taskId].assignee \in Users /\
        taskId \in GetUserTasks(tasks[taskId].assignee)

ValidTaskIds ==
    \A taskId \in DOMAIN tasks : 
        /\ taskId < nextTaskId
        /\ taskId >= 1

NoDuplicateTaskIds ==
    \A t1, t2 \in DOMAIN tasks :
        t1 = t2 \/ tasks[t1].id # tasks[t2].id

ValidStateTransitionsInvariant ==
    \A taskId \in DOMAIN tasks :
        tasks[taskId].status \in TaskStates

ConsistentTimestamps ==
    \A taskId \in DOMAIN tasks :
        /\ tasks[taskId].createdAt <= tasks[taskId].updatedAt
        /\ tasks[taskId].updatedAt <= clock

NoCyclicDependencies ==
    LET
        \* Transitive closure of dependencies
        RECURSIVE TransitiveDeps(_)
        TransitiveDeps(taskId) ==
            IF ~TaskExists(taskId) THEN {}
            ELSE LET directDeps == tasks[taskId].dependencies IN
                 directDeps \cup 
                 UNION {TransitiveDeps(dep) : dep \in directDeps}
    IN
    \A taskId \in DOMAIN tasks :
        taskId \notin TransitiveDeps(taskId)

AuthenticationRequired ==
    \* All task operations require authentication
    \A taskId \in DOMAIN tasks :
        tasks[taskId].createdBy \in Users

SafetyInvariant ==
    /\ NoOrphanTasks
    /\ TaskOwnership
    /\ ValidTaskIds
    /\ NoDuplicateTaskIds
    /\ ValidStateTransitionsInvariant
    /\ ConsistentTimestamps
    /\ NoCyclicDependencies
    /\ AuthenticationRequired

\* Next state relation
Next ==
    \/ AdvanceTime
    \/ \E user \in Users : Authenticate(user)
    \/ Logout
    \/ \E t \in Titles, d \in Descriptions, p \in Priorities, 
         u \in Users, dd \in 0..MaxTime \cup {"NULL"},
         tags \in SUBSET {"bug", "feature", "enhancement", "documentation"},
         deps \in SUBSET DOMAIN tasks :
       CreateTask(t, d, p, u, dd, tags, deps)
    \/ \E taskId \in DOMAIN tasks, newStatus \in TaskStates :
       UpdateTaskStatus(taskId, newStatus)
    \/ \E taskId \in DOMAIN tasks, newPriority \in Priorities :
       UpdateTaskPriority(taskId, newPriority)
    \/ \E taskId \in DOMAIN tasks, newAssignee \in Users :
       ReassignTask(taskId, newAssignee)
    \/ \E taskId \in DOMAIN tasks, t \in Titles, 
         d \in Descriptions, dd \in 0..MaxTime \cup {"NULL"} :
       UpdateTaskDetails(taskId, t, d, dd)
    \/ \E taskId \in DOMAIN tasks : DeleteTask(taskId)
    \/ CheckDependencies
    \/ \E taskIds \in SUBSET DOMAIN tasks, newStatus \in TaskStates :
       taskIds # {} /\ BulkUpdateStatus(taskIds, newStatus)

\* Properties to check
THEOREM TypeCorrectness == Spec => []TypeInvariant
THEOREM SafetyHolds == Spec => []SafetyInvariant
THEOREM LivenessHolds == Spec => (EventualCompletion /\ FairProgress)
THEOREM NoDeadlock == Spec => []<>Next
THEOREM Termination == Spec => <>(\A taskId \in DOMAIN tasks : 
                                    tasks[taskId].status \in {"completed", "cancelled"})
=============================================================================

Step 4: Model Checking and Trace Generation

Now we can run TLA+ model checking to verify our specification (TaskManagement.cfg):

\* Model configuration for TaskManagementImproved module
SPECIFICATION Spec

\* Constants definition
CONSTANTS
    Users = {alice, bob, charlie}
    MaxTasks = 5
    MaxTime = 20
    Titles = {task1, task2, task3, task4, task5}
    Descriptions = {desc1, desc2, desc3}

\* Model values for special constants
CONSTANT
    NULL = NULL
    EMPTY_STRING = EMPTY_STRING

\* Initial state constraint
CONSTRAINT
    /\ nextTaskId <= MaxTasks + 1
    /\ clock <= MaxTime
    /\ Cardinality(DOMAIN tasks) <= MaxTasks

\* State space reduction (optional, for faster checking)
ACTION_CONSTRAINT
    \* Limit number of active sessions
    /\ Cardinality({u \in Users : sessions[u] = TRUE}) <= 2
    \* Prevent creating too many tasks at once
    /\ nextTaskId <= MaxTasks

\* Invariants to check
INVARIANT TypeInvariant
INVARIANT SafetyInvariant
INVARIANT NoOrphanTasks
INVARIANT TaskOwnership
INVARIANT ValidTaskIds
INVARIANT NoDuplicateTaskIds
INVARIANT ValidStateTransitionsInvariant
INVARIANT ConsistentTimestamps
INVARIANT NoCyclicDependencies
INVARIANT AuthenticationRequired

\* Properties to check
PROPERTY EventualCompletion
PROPERTY FairProgress
PROPERTY EventualUnblocking
PROPERTY EventualAuthentication
PROPERTY NoStarvation

\* Check for deadlocks
CHECK_DEADLOCK TRUE

\* View for debugging (optional)
VIEW <<nextTaskId, Cardinality(DOMAIN tasks), clock>>

\* Alias for better state visualization
ALIAS TaskSummary == [
    totalTasks |-> Cardinality(DOMAIN tasks),
    pendingTasks |-> Cardinality({t \in DOMAIN tasks : tasks[t].status = "pending"}),
    inProgressTasks |-> Cardinality({t \in DOMAIN tasks : tasks[t].status = "in_progress"}),
    completedTasks |-> Cardinality({t \in DOMAIN tasks : tasks[t].status = "completed"}),
    blockedTasks |-> Cardinality({t \in DOMAIN tasks : tasks[t].status = "blocked"}),
    activeSessions |-> {u \in Users : sessions[u] = TRUE},
    currentUser |-> currentUser,
    time |-> clock
]

# Run the model checker
java -jar tla2tools.jar -config TaskManagement.cfg TaskManagement.tla

TLA+ will explore all possible execution paths and either:

Verify that all properties hold
Generate counterexamples showing property violations

Step 5: Claude Implementation with TLA+ Context

Now comes the magic. Instead of vague natural language, we give Claude the precise TLA+ specification:

My prompt to Claude:

Using this TLA+ specification for a task management system, implement a Go REST API 
that satisfies all the specified behaviors and properties.

[Include the full TLA+ specification here]

Requirements:
1. Implement all actions defined in the TLA+ spec
2. Ensure all safety properties hold at runtime
3. Use Go's type system to enforce the data structures
4. Include comprehensive tests that cover the execution traces
5. Add proper error handling for violation of preconditions

Focus on behavioral correctness over performance optimization.

Claude generates something like (task.go):

// Package domain contains the core business entities matching the TLA+ specification
package domain

import (
	"fmt"
	"time"
)

// TaskID represents a unique task identifier (maps to TLA+ task ID)
type TaskID int

// UserID represents a user identifier (maps to TLA+ Users set)
type UserID string

// TaskStatus represents the state of a task (maps to TLA+ TaskStates)
type TaskStatus string

const (
	StatusPending    TaskStatus = "pending"
	StatusInProgress TaskStatus = "in_progress"
	StatusCompleted  TaskStatus = "completed"
	StatusCancelled  TaskStatus = "cancelled"
	StatusBlocked    TaskStatus = "blocked"
)

// Priority represents task priority levels (maps to TLA+ Priorities)
type Priority string

const (
	PriorityLow      Priority = "low"
	PriorityMedium   Priority = "medium"
	PriorityHigh     Priority = "high"
	PriorityCritical Priority = "critical"
)

// Tag represents task categories (maps to TLA+ tags subset)
type Tag string

const (
	TagBug           Tag = "bug"
	TagFeature       Tag = "feature"
	TagEnhancement   Tag = "enhancement"
	TagDocumentation Tag = "documentation"
)

// Task represents a task entity (maps to TLA+ task record)
type Task struct {
	ID           TaskID            `json:"id"`
	Title        string            `json:"title"`
	Description  string            `json:"description"`
	Status       TaskStatus        `json:"status"`
	Priority     Priority          `json:"priority"`
	Assignee     UserID            `json:"assignee"`
	CreatedBy    UserID            `json:"created_by"`
	CreatedAt    time.Time         `json:"created_at"`
	UpdatedAt    time.Time         `json:"updated_at"`
	DueDate      *time.Time        `json:"due_date,omitempty"`
	Tags         []Tag             `json:"tags"`
	Dependencies map[TaskID]bool   `json:"dependencies"`
}

// ValidTransition represents a valid state transition (maps to TLA+ ValidTransitions)
type ValidTransition struct {
	From TaskStatus
	To   TaskStatus
}

// ValidTransitions defines all allowed state transitions
var ValidTransitions = map[ValidTransition]bool{
	{StatusPending, StatusInProgress}:    true,
	{StatusPending, StatusCancelled}:     true,
	{StatusPending, StatusBlocked}:       true,
	{StatusInProgress, StatusCompleted}:  true,
	{StatusInProgress, StatusCancelled}:  true,
	{StatusInProgress, StatusBlocked}:    true,
	{StatusInProgress, StatusPending}:    true, // Allow reverting
	{StatusBlocked, StatusPending}:       true,
	{StatusBlocked, StatusInProgress}:    true,
	{StatusBlocked, StatusCancelled}:     true,
}

// IsValidTransition checks if a state transition is valid (maps to TLA+ IsValidTransition)
func IsValidTransition(from, to TaskStatus) bool {
	return ValidTransitions[ValidTransition{From: from, To: to}]
}

// CanDelete checks if a task can be deleted (only completed or cancelled)
func (t *Task) CanDelete() bool {
	return t.Status == StatusCompleted || t.Status == StatusCancelled
}

// IsBlocked checks if task should be blocked based on dependencies
func (t *Task) IsBlocked(allTasks map[TaskID]*Task) bool {
	if len(t.Dependencies) == 0 {
		return false
	}
	
	for depID := range t.Dependencies {
		if dep, exists := allTasks[depID]; exists {
			if dep.Status != StatusCompleted {
				return true
			}
		}
	}
	return false
}

// ShouldUnblock checks if a blocked task can be unblocked
func (t *Task) ShouldUnblock(allTasks map[TaskID]*Task) bool {
	if t.Status != StatusBlocked {
		return false
	}
	
	for depID := range t.Dependencies {
		if dep, exists := allTasks[depID]; exists {
			if dep.Status != StatusCompleted {
				return false
			}
		}
	}
	return true
}

// Validate performs domain validation on the task
func (t *Task) Validate() error {
	if t.Title == "" {
		return fmt.Errorf("task title cannot be empty")
	}
	if t.Description == "" {
		return fmt.Errorf("task description cannot be empty")
	}
	if !isValidStatus(t.Status) {
		return fmt.Errorf("invalid task status: %s", t.Status)
	}
	if !isValidPriority(t.Priority) {
		return fmt.Errorf("invalid task priority: %s", t.Priority)
	}
	if t.Assignee == "" {
		return fmt.Errorf("task must have an assignee")
	}
	if t.CreatedBy == "" {
		return fmt.Errorf("task must have a creator")
	}
	if t.CreatedAt.After(t.UpdatedAt) {
		return fmt.Errorf("created time cannot be after updated time")
	}
	for _, tag := range t.Tags {
		if !isValidTag(tag) {
			return fmt.Errorf("invalid tag: %s", tag)
		}
	}
	return nil
}

func isValidStatus(status TaskStatus) bool {
	switch status {
	case StatusPending, StatusInProgress, StatusCompleted, StatusCancelled, StatusBlocked:
		return true
	default:
		return false
	}
}

func isValidPriority(priority Priority) bool {
	switch priority {
	case PriorityLow, PriorityMedium, PriorityHigh, PriorityCritical:
		return true
	default:
		return false
	}
}

func isValidTag(tag Tag) bool {
	switch tag {
	case TagBug, TagFeature, TagEnhancement, TagDocumentation:
		return true
	default:
		return false
	}
}

// Package usecase implements the TLA+ actions as use cases
package usecase

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"time"
	
	"github.com/bhatti/sample-task-management/internal/domain"
	"github.com/bhatti/sample-task-management/internal/repository"
)

// TaskUseCase implements task-related TLA+ actions
type TaskUseCase struct {
	uow              repository.UnitOfWork
	invariantChecker InvariantChecker
}

// InvariantChecker interface for runtime invariant validation
type InvariantChecker interface {
	CheckAllInvariants(state *domain.SystemState) error
	CheckTaskInvariants(task *domain.Task, state *domain.SystemState) error
	CheckTransitionInvariant(from, to domain.TaskStatus) error
}

// NewTaskUseCase creates a new task use case
func NewTaskUseCase(uow repository.UnitOfWork, checker InvariantChecker) *TaskUseCase {
	return &TaskUseCase{
		uow:              uow,
		invariantChecker: checker,
	}
}

// Authenticate implements TLA+ Authenticate action
func (uc *TaskUseCase) Authenticate(userID domain.UserID) (*domain.Session, error) {
	// Preconditions from TLA+:
	// - user \in Users
	// - ~sessions[user]
	
	user, err := uc.uow.Users().GetUser(userID)
	if err != nil {
		return nil, fmt.Errorf("user not found: %w", err)
	}
	
	// Check if user already has an active session
	existingSession, _ := uc.uow.Sessions().GetSessionByUser(userID)
	if existingSession != nil && existingSession.IsValid() {
		return nil, fmt.Errorf("user %s already has an active session", userID)
	}
	
	// Create new session
	token := generateToken()
	session := &domain.Session{
		UserID:    user.ID,
		Token:     token,
		Active:    true,
		CreatedAt: time.Now(),
		ExpiresAt: time.Now().Add(24 * time.Hour),
	}
	
	// Update state
	if err := uc.uow.Sessions().CreateSession(session); err != nil {
		return nil, fmt.Errorf("failed to create session: %w", err)
	}
	
	if err := uc.uow.SystemState().SetCurrentUser(&userID); err != nil {
		return nil, fmt.Errorf("failed to set current user: %w", err)
	}
	
	// Check invariants
	state, _ := uc.uow.SystemState().GetSystemState()
	if err := uc.invariantChecker.CheckAllInvariants(state); err != nil {
		uc.uow.Rollback()
		return nil, fmt.Errorf("invariant violation: %w", err)
	}
	
	return session, nil
}


// CreateTask implements TLA+ CreateTask action
func (uc *TaskUseCase) CreateTask(
	title, description string,
	priority domain.Priority,
	assignee domain.UserID,
	dueDate *time.Time,
	tags []domain.Tag,
	dependencies []domain.TaskID,
) (*domain.Task, error) {
	// Preconditions from TLA+:
	// - currentUser # NULL
	// - currentUser \in Users
	// - nextTaskId <= MaxTasks
	// - deps \subseteq DOMAIN tasks
	// - \A dep \in deps : tasks[dep].status # "cancelled"
	
	currentUser, err := uc.uow.SystemState().GetCurrentUser()
	if err != nil || currentUser == nil {
		return nil, fmt.Errorf("authentication required")
	}
	
	// Check max tasks limit
	nextID, err := uc.uow.SystemState().GetNextTaskID()
	if err != nil {
		return nil, fmt.Errorf("failed to get next task ID: %w", err)
	}
	
	if nextID > domain.MaxTasks {
		return nil, fmt.Errorf("maximum number of tasks (%d) reached", domain.MaxTasks)
	}
	
	// Validate dependencies
	allTasks, err := uc.uow.Tasks().GetAllTasks()
	if err != nil {
		return nil, fmt.Errorf("failed to get tasks: %w", err)
	}
	
	depMap := make(map[domain.TaskID]bool)
	for _, depID := range dependencies {
		depTask, exists := allTasks[depID]
		if !exists {
			return nil, fmt.Errorf("dependency task %d does not exist", depID)
		}
		if depTask.Status == domain.StatusCancelled {
			return nil, fmt.Errorf("cannot depend on cancelled task %d", depID)
		}
		depMap[depID] = true
	}
	
	// Check for cyclic dependencies
	if err := uc.checkCyclicDependencies(nextID, depMap, allTasks); err != nil {
		return nil, err
	}
	
	// Determine initial status based on dependencies
	status := domain.StatusPending
	if len(dependencies) > 0 {
		// Check if all dependencies are completed
		allCompleted := true
		for depID := range depMap {
			if allTasks[depID].Status != domain.StatusCompleted {
				allCompleted = false
				break
			}
		}
		if !allCompleted {
			status = domain.StatusBlocked
		}
	}
	
	// Create task
	task := &domain.Task{
		ID:           nextID,
		Title:        title,
		Description:  description,
		Status:       status,
		Priority:     priority,
		Assignee:     assignee,
		CreatedBy:    *currentUser,
		CreatedAt:    time.Now(),
		UpdatedAt:    time.Now(),
		DueDate:      dueDate,
		Tags:         tags,
		Dependencies: depMap,
	}
	
	// Validate task
	if err := task.Validate(); err != nil {
		return nil, fmt.Errorf("task validation failed: %w", err)
	}
	
	// Save task
	if err := uc.uow.Tasks().CreateTask(task); err != nil {
		return nil, fmt.Errorf("failed to create task: %w", err)
	}
	
	// Increment next task ID
	if _, err := uc.uow.SystemState().IncrementNextTaskID(); err != nil {
		return nil, fmt.Errorf("failed to increment task ID: %w", err)
	}
	
	// Check invariants
	state, _ := uc.uow.SystemState().GetSystemState()
	if err := uc.invariantChecker.CheckAllInvariants(state); err != nil {
		uc.uow.Rollback()
		return nil, fmt.Errorf("invariant violation after task creation: %w", err)
	}
	
	return task, nil
}

// UpdateTaskStatus implements TLA+ UpdateTaskStatus action
func (uc *TaskUseCase) UpdateTaskStatus(taskID domain.TaskID, newStatus domain.TaskStatus) error {
	// Preconditions from TLA+:
	// - currentUser # NULL
	// - TaskExists(taskId)
	// - taskId \in GetUserTasks(currentUser)
	// - IsValidTransition(tasks[taskId].status, newStatus)
	// - newStatus = "in_progress" => all dependencies completed
	
	currentUser, err := uc.uow.SystemState().GetCurrentUser()
	if err != nil || currentUser == nil {
		return fmt.Errorf("authentication required")
	}
	
	task, err := uc.uow.Tasks().GetTask(taskID)
	if err != nil {
		return fmt.Errorf("task not found: %w", err)
	}
	
	// Check user owns the task
	userTasks, err := uc.uow.SystemState().GetUserTasks(*currentUser)
	if err != nil {
		return fmt.Errorf("failed to get user tasks: %w", err)
	}
	
	hasTask := false
	for _, id := range userTasks {
		if id == taskID {
			hasTask = true
			break
		}
	}
	
	if !hasTask {
		return fmt.Errorf("user does not have access to task %d", taskID)
	}
	
	// Check valid transition
	if !domain.IsValidTransition(task.Status, newStatus) {
		return fmt.Errorf("invalid transition from %s to %s", task.Status, newStatus)
	}
	
	// Check dependencies if moving to in_progress
	if newStatus == domain.StatusInProgress {
		allTasks, _ := uc.uow.Tasks().GetAllTasks()
		for depID := range task.Dependencies {
			if depTask, exists := allTasks[depID]; exists {
				if depTask.Status != domain.StatusCompleted {
					return fmt.Errorf("cannot start task: dependency %d is not completed", depID)
				}
			}
		}
	}
	
	// Update status
	task.Status = newStatus
	task.UpdatedAt = time.Now()
	
	if err := uc.uow.Tasks().UpdateTask(task); err != nil {
		return fmt.Errorf("failed to update task: %w", err)
	}
	
	// Check invariants
	state, _ := uc.uow.SystemState().GetSystemState()
	if err := uc.invariantChecker.CheckAllInvariants(state); err != nil {
		uc.uow.Rollback()
		return fmt.Errorf("invariant violation: %w", err)
	}
	
	return nil
}
...

Step 6: TLA+ Generated Tests

The real power comes when we use TLA+ execution traces to generate comprehensive tests:

My prompt to Claude:

Generate Go tests that verify the implementation satisfies the TLA+ specification.
Create test cases that:

1. Test all TLA+ actions with valid preconditions
2. Test safety property violations
3. Test edge cases from the TLA+ model boundary conditions
4. Use property-based testing where appropriate

Include tests that would catch the execution traces TLA+ model checker explores.

Claude generates (invariant_test.go):

// Package property implements property-based tests for TLA+ invariants
package property

import (
	"math/rand"
	"testing"
	"time"

	"github.com/bhatti/sample-task-management/internal/domain"
	"github.com/bhatti/sample-task-management/internal/infrastructure/memory"
	"github.com/bhatti/sample-task-management/internal/usecase"
	"github.com/bhatti/sample-task-management/pkg/invariants"
	"github.com/stretchr/testify/assert"
	"github.com/stretchr/testify/require"
)

// TestInvariantsHoldAfterOperations verifies invariants hold after each operation
func TestInvariantsHoldAfterOperations(t *testing.T) {
	repo := memory.NewMemoryRepository()
	uow := memory.NewMemoryUnitOfWork(repo)
	checker := invariants.NewInvariantChecker()
	uc := usecase.NewTaskUseCase(uow, checker)

	// Setup initial users
	users := []domain.UserID{"alice", "bob", "charlie"}
	for _, userID := range users {
		user := &domain.User{
			ID:       userID,
			Name:     string(userID),
			Email:    string(userID) + "@example.com",
			JoinedAt: time.Now(),
		}
		require.NoError(t, repo.CreateUser(user))
	}

	// Property: Invariants hold after authentication
	t.Run("InvariantsAfterAuthentication", func(t *testing.T) {
		for _, userID := range users {
			session, err := uc.Authenticate(userID)
			assert.NoError(t, err)
			assert.NotNil(t, session)

			state, _ := repo.GetSystemState()
			assert.NoError(t, checker.CheckAllInvariants(state))

			// Cleanup
			_ = uc.Logout(userID)
		}
	})

	// Property: Invariants hold after task creation
	t.Run("InvariantsAfterTaskCreation", func(t *testing.T) {
		uc.Authenticate("alice")

		for i := 0; i < 10; i++ {
			task, err := uc.CreateTask(
				"Task "+string(rune(i)),
				"Description",
				randomPriority(),
				randomUser(users),
				randomDueDate(),
				randomTags(),
				[]domain.TaskID{}, // No dependencies initially
			)

			assert.NoError(t, err)
			assert.NotNil(t, task)

			state, _ := repo.GetSystemState()
			assert.NoError(t, checker.CheckAllInvariants(state))
		}
	})

	// Property: Invariants hold after status transitions
	t.Run("InvariantsAfterStatusTransitions", func(t *testing.T) {
		uc.Authenticate("alice")

		// Create a task
		task, _ := uc.CreateTask(
			"Test Task",
			"Description",
			domain.PriorityMedium,
			"alice",
			nil,
			[]domain.Tag{domain.TagFeature},
			[]domain.TaskID{},
		)

		// Valid transitions
		validTransitions := []domain.TaskStatus{
			domain.StatusInProgress,
			domain.StatusCompleted,
		}

		for _, status := range validTransitions {
			err := uc.UpdateTaskStatus(task.ID, status)
			if err == nil {
				state, _ := repo.GetSystemState()
				assert.NoError(t, checker.CheckAllInvariants(state))
			}
		}
	})

	// Property: No cyclic dependencies can be created
	t.Run("NoCyclicDependencies", func(t *testing.T) {
		uc.Authenticate("alice")

		// Create tasks with potential cycles
		task1, _ := uc.CreateTask("Task1", "Desc", domain.PriorityLow, "alice", nil, nil, []domain.TaskID{})
		task2, _ := uc.CreateTask("Task2", "Desc", domain.PriorityLow, "alice", nil, nil, []domain.TaskID{task1.ID})
		task3, _ := uc.CreateTask("Task3", "Desc", domain.PriorityLow, "alice", nil, nil, []domain.TaskID{task2.ID})

		// Attempting to create a cycle should fail
		_, err := uc.CreateTask("Task4", "Desc", domain.PriorityLow, "alice", nil, nil,
			[]domain.TaskID{task3.ID, task1.ID}) // This would create a cycle
		assert.NoError(t, err)

		// Even if it doesn't fail explicitly, invariants should catch it
		state, _ := repo.GetSystemState()
		assert.NoError(t, checker.CheckAllInvariants(state))
	})
}

// TestTransitionInvariants tests state transition validity
func TestTransitionInvariants(t *testing.T) {
	checker := invariants.NewInvariantChecker()

	// Test all valid transitions
	validTransitions := []struct {
		from domain.TaskStatus
		to   domain.TaskStatus
	}{
		{domain.StatusPending, domain.StatusInProgress},
		{domain.StatusPending, domain.StatusCancelled},
		{domain.StatusInProgress, domain.StatusCompleted},
		{domain.StatusInProgress, domain.StatusCancelled},
		{domain.StatusBlocked, domain.StatusPending},
		{domain.StatusBlocked, domain.StatusCancelled},
	}

	for _, trans := range validTransitions {
		t.Run(string(trans.from)+"_to_"+string(trans.to), func(t *testing.T) {
			err := checker.CheckTransitionInvariant(trans.from, trans.to)
			assert.NoError(t, err)
		})
	}

	// Test invalid transitions
	invalidTransitions := []struct {
		from domain.TaskStatus
		to   domain.TaskStatus
	}{
		{domain.StatusCompleted, domain.StatusPending},
		{domain.StatusCompleted, domain.StatusInProgress},
		{domain.StatusCancelled, domain.StatusInProgress},
		{domain.StatusPending, domain.StatusCompleted}, // Must go through in_progress
	}

	for _, trans := range invalidTransitions {
		t.Run("Invalid_"+string(trans.from)+"_to_"+string(trans.to), func(t *testing.T) {
			err := checker.CheckTransitionInvariant(trans.from, trans.to)
			assert.Error(t, err)
		})
	}
}

// TestPropertyTaskOwnership verifies task ownership invariants
func TestPropertyTaskOwnership(t *testing.T) {
	repo := memory.NewMemoryRepository()
	uow := memory.NewMemoryUnitOfWork(repo)
	checker := invariants.NewInvariantChecker()
	uc := usecase.NewTaskUseCase(uow, checker)

	// Setup users
	users := []domain.UserID{"alice", "bob"}
	for _, userID := range users {
		user := &domain.User{
			ID:       userID,
			Name:     string(userID),
			Email:    string(userID) + "@example.com",
			JoinedAt: time.Now(),
		}
		repo.CreateUser(user)
	}

	// Property: Task reassignment maintains ownership invariants
	t.Run("ReassignmentMaintainsOwnership", func(t *testing.T) {
		uc.Authenticate("alice")

		// Create task assigned to Alice
		task, err := uc.CreateTask(
			"Test Task",
			"Description",
			domain.PriorityHigh,
			"alice",
			nil,
			[]domain.Tag{domain.TagBug},
			[]domain.TaskID{},
		)
		require.NoError(t, err)

		// Check initial ownership
		state, _ := repo.GetSystemState()
		assert.NoError(t, checker.CheckAllInvariants(state))

		aliceTasks := state.GetUserTasks("alice")
		assert.Contains(t, aliceTasks, task.ID)

		// Reassign to Bob
		err = uc.ReassignTask(task.ID, "bob")
		require.NoError(t, err)

		// Check ownership after reassignment
		state, _ = repo.GetSystemState()
		assert.NoError(t, checker.CheckAllInvariants(state))

		aliceTasks = state.GetUserTasks("alice")
		bobTasks := state.GetUserTasks("bob")
		assert.NotContains(t, aliceTasks, task.ID)
		assert.Contains(t, bobTasks, task.ID)
	})
}

// TestPropertyConcurrentOperations tests invariants under concurrent operations
func TestPropertyConcurrentOperations(t *testing.T) {
	repo := memory.NewMemoryRepository()
	uow := memory.NewMemoryUnitOfWork(repo)
	checker := invariants.NewInvariantChecker()

	// Setup users
	users := []domain.UserID{"user1", "user2", "user3"}
	for _, userID := range users {
		user := &domain.User{
			ID:       userID,
			Name:     string(userID),
			Email:    string(userID) + "@example.com",
			JoinedAt: time.Now(),
		}
		repo.CreateUser(user)
	}

	// Run concurrent operations
	done := make(chan bool, len(users))

	for _, userID := range users {
		go func(uid domain.UserID) {
			uc := usecase.NewTaskUseCase(uow, checker)

			// Authenticate
			uc.Authenticate(uid)

			// Create multiple tasks
			for i := 0; i < 5; i++ {
				uc.CreateTask(
					"Task",
					"Description",
					randomPriority(),
					uid,
					nil,
					randomTags(),
					[]domain.TaskID{},
				)

				// Random delay
				time.Sleep(time.Duration(rand.Intn(10)) * time.Millisecond)
			}

			done <- true
		}(userID)
	}

	// Wait for all goroutines
	for i := 0; i < len(users); i++ {
		<-done
	}

	// Check invariants after concurrent operations
	state, _ := repo.GetSystemState()
	assert.NoError(t, checker.CheckAllInvariants(state))
}

// Helper functions

func randomPriority() domain.Priority {
	priorities := []domain.Priority{
		domain.PriorityLow,
		domain.PriorityMedium,
		domain.PriorityHigh,
		domain.PriorityCritical,
	}
	return priorities[rand.Intn(len(priorities))]
}

func randomUser(users []domain.UserID) domain.UserID {
	return users[rand.Intn(len(users))]
}

func randomDueDate() *time.Time {
	if rand.Float32() < 0.5 {
		return nil
	}
	due := time.Now().Add(time.Duration(rand.Intn(30)) * 24 * time.Hour)
	return &due
}

func randomTags() []domain.Tag {
	allTags := []domain.Tag{
		domain.TagBug,
		domain.TagFeature,
		domain.TagEnhancement,
		domain.TagDocumentation,
	}

	numTags := rand.Intn(len(allTags) + 1)
	if numTags == 0 {
		return nil
	}

	tags := make([]domain.Tag, 0, numTags)
	used := make(map[domain.Tag]bool)

	for len(tags) < numTags {
		tag := allTags[rand.Intn(len(allTags))]
		if !used[tag] {
			tags = append(tags, tag)
			used[tag] = true
		}
	}

	return tags
}

Advanced TLA+ Patterns with Claude

Modeling Concurrent Operations

One of TLA+’s strengths is modeling concurrent systems. Let’s extend our specification to handle concurrent task updates:

\* Concurrent task updates with conflict resolution
ConcurrentUpdateTask(taskId, newStatus, version) ==
    /\ currentUser # NULL
    /\ taskId \in DOMAIN tasks
    /\ taskId \in userTasks[currentUser]
    /\ tasks[taskId].version = version  \* Optimistic concurrency control
    /\ tasks' = [tasks EXCEPT ![taskId] = [
                     @ EXCEPT 
                     !.status = newStatus,
                     !.version = @ + 1,
                     !.lastModified = currentUser
                 ]]
    /\ UNCHANGED <<userTasks, nextTaskId, currentUser>>

Prompt to Claude:

Implement optimistic concurrency control for the task updates based on this 
TLA+ specification. Include version tracking and conflict detection.

Modeling Complex Business Rules

TLA+ excels at capturing complex business logic:

\* Business rule: High priority tasks cannot be cancelled directly
ValidStatusTransition(currentStatus, newStatus, priority) ==
    \/ newStatus = currentStatus
    \/ /\ currentStatus = "pending" 
       /\ newStatus \in {"in_progress", "cancelled"}
    \/ /\ currentStatus = "in_progress"
       /\ newStatus \in {"completed", "pending"}
    \/ /\ currentStatus = "in_progress"
       /\ newStatus = "cancelled"
       /\ priority # "high"  \* High priority tasks cannot be cancelled

Lessons Learned

After applying this TLA+ approach to several experimental projects, here are the key insights:

1. Start Small

Begin with core actions and properties. TLA+ specifications can grow complex quickly, so start with the essential behaviors:

\* Start with basic CRUD
Init, CreateTask, UpdateTask, DeleteTask

\* Add complexity incrementally  
Authentication, Authorization, Concurrency, Business Rules

Avoid Initially: Complex distributed systems, performance-critical algorithms

Graduate To: Multi-service interactions, complex business logic

2. Properties Drive Design

Writing TLA+ properties often reveals design flaws before implementation:

\* This property might fail, revealing a design issue
ConsistencyProperty == 
    \A user \in Users:
        \A taskId \in userTasks[user]:
            /\ taskId \in DOMAIN tasks
            /\ tasks[taskId].assignee = user
            /\ tasks[taskId].status # "deleted"  \* Soft delete consideration

3. Model Checking Finds Edge Cases

TLA+ model checking explores execution paths you’d never think to test:

# TLA+ finds this counterexample:
# Step 1: User1 creates Task1
# Step 2: User1 deletes Task1  
# Step 3: User2 creates Task2 (gets same ID due to reuse)
# Step 4: User1 tries to update Task1 -> Security violation!

This led to using UUIDs instead of incrementing integers for task IDs.

4. Generated Tests Are Comprehensive

TLA+ execution traces become your regression test suite. When Claude implements based on TLA+ specs, you get:

Complete coverage – All specification paths tested
Edge case detection – Boundary conditions from model checking
Behavioral contracts – Tests verify actual system properties

Documentation Generation

Prompt to Claude:

Generate API documentation from this TLA+ specification that includes:
1. Endpoint descriptions derived from TLA+ actions
2. Request/response schemas from TLA+ data structures  
3. Error conditions from TLA+ preconditions
4. Behavioral guarantees from TLA+ properties

Code Review Guidelines

With TLA+ specifications, code reviews become more focused:

Does implementation satisfy the TLA+ spec?
Are all preconditions checked?
Do safety properties hold?
Are error conditions handled as specified?

Comparing Specification Approaches

Approach	Precision	AI Effectiveness	Maintenance	Learning Curve	Tool Complexity	Code Generation
Vibe Coding	Low	Inconsistent	High	Low	Low	N/A
UML/MDD	Medium	Poor	Very High	High	Very High	Brittle
BDD/Gherkin	Medium	Better	Medium	Medium	Low	Limited
TLA+ Specs	High	Excellent	Low	High	Low	Reliable

Tools and Resources

Essential TLA+ Resources

Learn TLA+: https://learntla.com – Interactive tutorial
TLA+ Video Course: Leslie Lamport’s official course
Practical TLA+: Hillel Wayne’s book – focus on software systems
TLA+ Examples: https://github.com/tlaplus/Examples

Common Mistakes

1. Avoid These Mistakes

? Writing TLA+ like code

\* Wrong - this looks like pseudocode
CreateTask == 
    if currentUser != null then
        task = new Task()

? Writing TLA+ as mathematical relations

\* Right - mathematical specification  
CreateTask == 
    /\ currentUser # NULL
    /\ tasks' = tasks @@ (nextTaskId :> newTask)

? Asking Claude to “fix the TLA+ to match the code”

The spec is the truth – fix the code to match the spec

? Asking Claude to “implement this TLA+ specification correctly”

? Specification scope creep: Starting with entire system architecture ? Incremental approach: Begin with one core workflow, expand gradually

2. Claude Integration Pitfalls

? “Fix the spec to match my code”: Treating specifications as documentation ? “Fix the code to match the spec”: Specifications are the source of truth

3. The Context Overload Trap

Problem: Dumping too much information at once
Solution: Break complex features into smaller, focused requests

4. The “Fix My Test” Antipattern

Problem: When tests fail, asking Claude to modify the test instead of the code
Solution: Always fix the implementation, not the test (unless the test is genuinely wrong)

5. The Blind Trust Mistake

Problem: Accepting generated code without understanding it
Solution: Always review and understand the code before committing

Proven Patterns

1. Save effective prompts:

# ~/.claude/tla-prompts/implementation.md
Implement [language] code that satisfies this TLA+ specification:

[SPEC]

Requirements:
- All TLA+ actions become functions/methods
- All preconditions become runtime checks  
- All data structures match TLA+ types
- Include comprehensive tests covering specification traces

Create specification templates:

--------------------------- MODULE [ModuleName] ---------------------------
EXTENDS Integers, Sequences, FiniteSets

CONSTANTS [Constants]

VARIABLES [StateVariables]

[TypeDefinitions]

Init == [InitialConditions]

[Actions]

Next == [ActionDisjunction]

Spec == Init /\ [][Next]_[StateVariables]

[SafetyProperties]

[LivenessProperties]

=============================================================================

2. The “Explain First” Pattern

Before asking Claude to implement something complex, I ask for an explanation:

Explain how you would implement real-time task updates using WebSockets. 
What are the trade-offs between Socket.io and native WebSockets?
What state management challenges should I consider?

3. The “Progressive Enhancement” Pattern

Start simple, then add complexity:

1. First: "Create a basic task model with CRUD operations"
2. Then: "Add validation and error handling"
3. Then: "Add authentication and authorization"
4. Finally: "Add real-time updates and notifications"

4. The “Code Review” Pattern

After implementation, I ask Claude to review its own code:

Review the task API implementation for:
- Security vulnerabilities
- Performance issues
- Code style consistency
- Missing error cases

Be critical and suggest improvements.

What’s Next

As I’ve developed this TLA+/Claude workflow, I’ve realized we’re approaching something profound: specifications as the primary artifact. Instead of writing code and hoping it’s correct, we’re defining correct behavior formally and letting AI generate the implementation. This inverts the traditional relationship between specification and code.

Implications for Software Engineering

Design-first development becomes natural
Bug prevention replaces bug fixing
Refactoring becomes re-implementation from stable specs
Documentation is always up-to-date (it’s the spec)

I’m currently experimenting with:

TLA+ to test case generation – Automated comprehensive testing
Multi-language implementations – Same spec, different languages
Specification composition – Building larger systems from verified components
Quint specifications – A modern executable specification language with simpler syntax than TLA+

Conclusion: The End of Vibe Coding

After using TLA+ with Claude, I can’t go back to vibe coding. The precision, reliability, and confidence that comes from executable specifications has transformed how I build software. The complete working example—TLA+ specs, Go implementation, comprehensive tests, and CI/CD pipeline—is available at github.com/bhatti/sample-task-management.

Yes, there’s a learning curve. Yes, writing TLA+ specifications takes time upfront. But the payoff—in terms of correctness, maintainability, and development speed—is extraordinary. Claude becomes not just a code generator, but a reliable engineering partner that can reason about complex systems precisely because we’ve given it precise specifications to work from. We’re moving from “code and hope” to “specify and know”—and that changes everything.

Shahzad Bhatti Welcome to my ramblings and rants!

August 25, 2025