Guide for Legacy Modernization Engineers

Complete walkthrough for modernizing legacy Python code with SpecFact CLI

Your Challenge

You’re responsible for modernizing a legacy Python system that:

Has minimal or no documentation
Was built by developers who have left
Contains critical business logic you can’t risk breaking
Needs migration to modern Python, cloud infrastructure, or microservices

Sound familiar? You’re not alone. By commonly cited industry estimates, around 70% of IT budgets go to legacy maintenance, and the legacy modernization market is valued at $25B+ and growing.

SpecFact for Brownfield: Your Safety Net

SpecFact CLI is designed specifically for your situation. It provides:

Automated spec extraction (code2spec) - Understand what your code does in < 10 seconds
Runtime contract enforcement - Prevent regressions during modernization
Symbolic execution - Discover hidden edge cases with CrossHair
Formal guarantees - Mathematical verification, not probabilistic LLM suggestions
CLI-first integration - Works with VS Code, Cursor, GitHub Actions, pre-commit hooks, or any IDE. Works offline, no account required, no vendor lock-in.

Step 1: Understand What You Have

CLI-First Approach: SpecFact runs offline, requires no account, and fits into the workflow you already have, so there is no platform to learn before you start.

Extract Specs from Legacy Code

# Analyze your legacy codebase
specfact import from-code --bundle legacy-api --repo ./legacy-app

# For large codebases or multi-project repos, analyze specific modules:
specfact import from-code --bundle core-module --repo ./legacy-app --entry-point src/core
specfact import from-code --bundle api-module --repo ./legacy-app --entry-point src/api

What you get:

✅ Auto-generated feature map of existing functionality
✅ Extracted user stories from code patterns
✅ Dependency graph showing module relationships
✅ Business logic documentation from function signatures
✅ Edge cases discovered via symbolic execution

Example output:

✅ Analyzed 47 Python files
✅ Extracted 23 features:

   - FEATURE-001: User Authentication (95% confidence)
   - FEATURE-002: Payment Processing (92% confidence)
   - FEATURE-003: Order Management (88% confidence)
   ...
✅ Generated 112 user stories from existing code patterns
✅ Detected 6 edge cases with CrossHair symbolic execution
⏱️  Completed in 8.2 seconds

Time saved: 60-120 hours of manual documentation work → 8 seconds

💡 Partial Repository Coverage:

For large codebases or monorepos with multiple projects, you can analyze specific subdirectories using --entry-point:

# Analyze only the core module
specfact import from-code --bundle core-module --repo . --entry-point src/core

# Analyze only the API service
specfact import from-code --bundle api-service --repo . --entry-point projects/api-service

This enables:

Faster analysis - Focus on specific modules for quicker feedback
Incremental modernization - Modernize one module at a time
Multi-plan support - Create separate plan bundles for different projects/modules
Better organization - Keep plans organized by project boundaries
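When several modules need their own bundle, the per-module commands above can be generated from a single mapping. The following is a minimal sketch, not part of SpecFact itself: it uses only the `--bundle`, `--repo`, and `--entry-point` flags shown above, and the `MODULES` mapping and helper names are hypothetical. By default it only builds the commands (dry run) so you can review them first.

```python
import subprocess
from typing import List, Optional


def build_import_command(bundle: str, repo: str, entry_point: Optional[str] = None) -> List[str]:
    """Build the argv for `specfact import from-code` using only the flags shown above."""
    cmd = ["specfact", "import", "from-code", "--bundle", bundle, "--repo", repo]
    if entry_point is not None:
        cmd += ["--entry-point", entry_point]
    return cmd


# Hypothetical bundle-to-subdirectory mapping; adjust to your repository layout.
MODULES = {
    "core-module": "src/core",
    "api-service": "projects/api-service",
}


def import_all(repo: str = ".", dry_run: bool = True) -> List[List[str]]:
    """Build one import command per module and, unless dry_run, run them in order."""
    commands = [build_import_command(bundle, repo, entry) for bundle, entry in MODULES.items()]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first failing import.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    for cmd in import_all(dry_run=True):
        print(" ".join(cmd))
```

Keeping the mapping in one place makes it easier to add modules one at a time as the modernization progresses.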

💡 Tip: After importing, the CLI may suggest generating a bootstrap constitution for Spec-Kit integration. This auto-generates a constitution from your repository analysis:

# If suggested, accept to auto-generate
# Or run manually:
specfact constitution bootstrap --repo .

This is especially useful if you plan to sync with Spec-Kit later.

Step 2: Add Contracts to Critical Paths

Identify Critical Functions

SpecFact helps you identify which functions are critical (high risk, high business value):

# Review extracted plan to identify critical paths
cat .specfact/projects/<bundle-name>/bundle.manifest.yaml
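While reviewing the extracted plan, it can help to have a rough, independent shortlist of functions that are likely to hide business rules. The sketch below is not SpecFact functionality and does not read the bundle manifest; it is a simple heuristic, assuming that functions with many branch points deserve contracts first. The `SAMPLE` source and the choice of branch node types are illustrative assumptions.

```python
import ast
from typing import List, Tuple

# Node types counted as "branch points" in this heuristic (an assumption, not a standard metric).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.ExceptHandler)


def rank_functions(source: str) -> List[Tuple[str, int]]:
    """Return (function name, branch count) pairs, most branching first."""
    tree = ast.parse(source)
    scores = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            branches = sum(isinstance(child, BRANCH_NODES) for child in ast.walk(node))
            scores.append((node.name, branches))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Illustrative stand-in for a legacy module.
SAMPLE = '''
def process_payment(user_id, amount, currency):
    if amount <= 0:
        return "FAILED"
    if currency not in ("USD", "EUR", "GBP"):
        return "FAILED"
    for attempt in range(3):
        if attempt == 2:
            return "SUCCESS"
    return "FAILED"

def format_receipt(order):
    return str(order)
'''

if __name__ == "__main__":
    for name, score in rank_functions(SAMPLE):
        print(f"{name}: {score} branch points")
```

Branch count is only a proxy for risk; business value still has to come from your own knowledge of the system.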

Add Runtime Contracts

Add contract decorators to critical functions:

# Before: Undocumented legacy function
def process_payment(user_id, amount, currency):
    # 80 lines of legacy code with hidden business rules
    ...

# After: Contract-enforced function
import icontract

@icontract.require(lambda amount: amount > 0, "Payment amount must be positive")
@icontract.require(lambda currency: currency in ['USD', 'EUR', 'GBP'])
@icontract.ensure(lambda result: result.status in ['SUCCESS', 'FAILED'])
def process_payment(user_id, amount, currency):
    # Same 80 lines of legacy code
    # Now with runtime enforcement
    ...

What this gives you:

✅ Runtime validation catches invalid inputs immediately
✅ Prevents regressions during refactoring
✅ Documents expected behavior (executable documentation)
✅ CrossHair discovers edge cases automatically

Step 3: Modernize with Confidence

Refactor Safely

With contracts in place, you can refactor knowing that violations will be caught:

# Refactored version (same contracts)
@icontract.require(lambda amount: amount > 0, "Payment amount must be positive")
@icontract.require(lambda currency: currency in ['USD', 'EUR', 'GBP'])
@icontract.ensure(lambda result: result.status in ['SUCCESS', 'FAILED'])
def process_payment(user_id, amount, currency):
    # Modernized implementation
    # If contract violated → exception raised immediately
    ...

Catch Regressions Automatically

# During modernization, a refactored caller accidentally breaks the contract:
process_payment(user_id=-1, amount=-50, currency="XYZ")

# Runtime enforcement catches it (icontract raises icontract.ViolationError;
# the message below is abbreviated for illustration):
# ❌ ViolationError: Payment amount must be positive (got -50)
#    at process_payment() call from refactored checkout.py:142
#    → Prevented production bug during modernization!
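The same check can be pinned down in a regression test so that a broken caller fails in CI rather than in production. The sketch below is dependency-free: `require` and `ViolationError` are small hand-written stand-ins for the icontract equivalents, not the library's implementation, and the body of `process_payment` is a placeholder.

```python
import functools
import inspect
from typing import Any, Callable


class ViolationError(AssertionError):
    """Raised when a precondition fails (local stand-in for the library's exception)."""


def require(condition: Callable[..., bool], description: str) -> Callable:
    """Check `condition` against the call's named arguments before running the function."""
    wanted = list(inspect.signature(condition).parameters)

    def decorator(func: Callable) -> Callable:
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            # Pass the condition only the arguments it names, e.g. `amount`.
            values = {name: bound.arguments[name] for name in wanted}
            if not condition(**values):
                raise ViolationError(f"{description}: {values}")
            return func(*args, **kwargs)

        return wrapper

    return decorator


@require(lambda amount: amount > 0, "Payment amount must be positive")
@require(lambda currency: currency in ("USD", "EUR", "GBP"), "Unsupported currency")
def process_payment(user_id, amount, currency):
    return "SUCCESS"  # placeholder for the legacy body


def test_negative_amount_is_rejected() -> None:
    """A caller that passes a negative amount must fail before the body runs."""
    try:
        process_payment(user_id=1, amount=-50, currency="USD")
    except ViolationError as exc:
        assert "Payment amount must be positive" in str(exc)
    else:
        raise AssertionError("expected a contract violation")
```

With icontract installed, the test body stays the same; only the imports and the exception class change.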

Step 4: Discover Hidden Edge Cases

CrossHair Symbolic Execution

SpecFact uses CrossHair to discover edge cases that manual testing can miss:

from typing import List

import icontract

# Legacy function with hidden edge case
@icontract.require(lambda numbers: len(numbers) > 0)
@icontract.ensure(lambda numbers, result: len(numbers) == 0 or min(numbers) > result)
def remove_smallest(numbers: List[int]) -> int:
    """Remove and return smallest number from list"""
    smallest = min(numbers)
    numbers.remove(smallest)
    return smallest

# CrossHair finds counterexample:
# Input: [3, 3, 5] → After removal: [3, 5], min=3, returned=3
# ❌ Postcondition violated: min(numbers) > result fails when duplicates exist!
# CrossHair generates concrete failing input: [3, 3, 5]

Why this matters:

✅ Discovers edge cases LLMs miss
✅ Mathematical proof of violations (not probabilistic)
✅ Generates concrete test inputs automatically
✅ Prevents production bugs before they happen
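The counterexample above can be replayed in plain Python to confirm it before deciding how to fix it. This is a minimal sketch without icontract or CrossHair: the postcondition is written as an ordinary function, and `relaxed_postcondition` is one possible correction (an assumption about the intended behavior, not the only valid fix).

```python
from typing import Callable, List


def remove_smallest(numbers: List[int]) -> int:
    """Remove and return the smallest number from the list (mutates the list)."""
    smallest = min(numbers)
    numbers.remove(smallest)
    return smallest


def strict_postcondition(numbers: List[int], result: int) -> bool:
    # Same condition as the contract above: every remaining value is larger.
    return len(numbers) == 0 or min(numbers) > result


def relaxed_postcondition(numbers: List[int], result: int) -> bool:
    # Allows duplicates of the smallest value to remain in the list.
    return len(numbers) == 0 or min(numbers) >= result


def holds(values: List[int], postcondition: Callable[[List[int], int], bool]) -> bool:
    """Run remove_smallest on a copy of `values` and evaluate the postcondition."""
    numbers = list(values)
    result = remove_smallest(numbers)
    return postcondition(numbers, result)


if __name__ == "__main__":
    print("strict, [3, 3, 5]:", holds([3, 3, 5], strict_postcondition))
    print("relaxed, [3, 3, 5]:", holds([3, 3, 5], relaxed_postcondition))
```

Whether to relax the postcondition or change the function to remove all duplicates depends on what callers of the legacy code actually rely on.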

Real-World Example: Django Legacy App

The Problem

You inherited a 3-year-old Django app with:

No documentation
No type hints
No tests
15 undocumented API endpoints
Business logic buried in views

The Solution

# Step 1: Extract specs
specfact import from-code --bundle customer-portal --repo ./legacy-django-app

# Output:
✅ Analyzed 47 Python files
✅ Extracted 23 features (API endpoints, background jobs, integrations)
✅ Generated 112 user stories from existing code patterns
✅ Time: 8 seconds

The Results

✅ Legacy app fully documented in < 10 minutes
✅ Prevented 4 production bugs during refactoring
✅ New developers onboard 60% faster
✅ CrossHair discovered 6 hidden edge cases

ROI: Time and Cost Savings

Manual Approach

Task	Time Investment	Cost (@$150/hr)
Manually document 50-file legacy app	80-120 hours	$12,000-$18,000
Write tests for undocumented code	100-150 hours	$15,000-$22,500
Debug regression during refactor	40-80 hours	$6,000-$12,000
TOTAL	220-350 hours	$33,000-$52,500

SpecFact Automated Approach

Task	Time Investment	Cost (@$150/hr)
Run code2spec extraction	10 minutes	$25
Review and refine extracted specs	8-16 hours	$1,200-$2,400
Add contracts to critical paths	16-24 hours	$2,400-$3,600
CrossHair edge case discovery	2-4 hours	$300-$600
TOTAL	26-44 hours	$3,925-$6,625

ROI: roughly 87-88% time saved, about $29,000-$46,000 cost avoided (low and high estimates compared like for like)
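The totals and savings can be recomputed from the two tables, which is also a convenient starting point for plugging in your own estimates. This sketch takes the task names, hour ranges, and the $150/hr rate from the tables; durations are kept in minutes so the 10-minute extraction stays an integer, and the low and high estimates are compared like for like.

```python
HOURLY_RATE = 150  # USD per hour, as in the tables above

# (low, high) estimates in minutes.
MANUAL = {
    "Manually document 50-file legacy app": (80 * 60, 120 * 60),
    "Write tests for undocumented code": (100 * 60, 150 * 60),
    "Debug regression during refactor": (40 * 60, 80 * 60),
}
SPECFACT = {
    "Run code2spec extraction": (10, 10),
    "Review and refine extracted specs": (8 * 60, 16 * 60),
    "Add contracts to critical paths": (16 * 60, 24 * 60),
    "CrossHair edge case discovery": (2 * 60, 4 * 60),
}


def total_minutes(tasks):
    """Sum the low and high estimates separately."""
    low = sum(lo for lo, _ in tasks.values())
    high = sum(hi for _, hi in tasks.values())
    return low, high


def cost(minutes):
    """Convert minutes to USD at the hourly rate."""
    return minutes * HOURLY_RATE / 60


def summarize():
    manual_low, manual_high = total_minutes(MANUAL)
    spec_low, spec_high = total_minutes(SPECFACT)
    return {
        "manual_cost": (cost(manual_low), cost(manual_high)),
        "specfact_cost": (cost(spec_low), cost(spec_high)),
        "cost_avoided": (cost(manual_low) - cost(spec_low), cost(manual_high) - cost(spec_high)),
        "time_saved_pct": (100 * (1 - spec_low / manual_low), 100 * (1 - spec_high / manual_high)),
    }


if __name__ == "__main__":
    for key, (low, high) in summarize().items():
        print(f"{key}: {low:,.0f} to {high:,.0f}")
```

Pairing the manual low estimate with the SpecFact high estimate (or the reverse) gives wider bounds, so state which comparison you use when reporting savings.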

Integration with Your Workflow

SpecFact CLI integrates seamlessly with your existing tools:

VS Code: Use pre-commit hooks to catch breaking changes before commit
Cursor: AI assistant workflows catch regressions during refactoring
GitHub Actions: CI/CD integration blocks bad code from merging
Pre-commit hooks: Local validation prevents breaking changes
Any IDE: Pure CLI-first approach—works with any editor

See real examples: Integration Showcases - 5 complete examples showing bugs fixed via integrations

Best Practices

1. Start with Shadow Mode

Begin in shadow mode to observe without blocking:

specfact import from-code --bundle legacy-api --repo . --shadow-only

2. Add Contracts Incrementally

Don’t try to contract everything at once:

Week 1: Add contracts to 3-5 critical functions
Week 2: Expand to 10-15 functions
Week 3: Add contracts to all public APIs
Week 4+: Add contracts to internal functions as needed

3. Use CrossHair for Edge Case Discovery

Run CrossHair on critical functions before refactoring:

hatch run contract-explore src/payment.py

4. Document Your Findings

Keep notes on:

Edge cases discovered
Contract violations caught
Time saved on documentation
Bugs prevented during modernization

Common Questions

Can SpecFact analyze code with no docstrings?

Yes. code2spec analyzes:

Function signatures and type hints
Code patterns and control flow
Existing validation logic
Module dependencies

No docstrings needed.

What if the legacy code has no type hints?

SpecFact infers types from usage patterns and generates specs. You can add type hints incrementally as part of modernization.

Can SpecFact handle obfuscated or minified code?

Limited. SpecFact works best with:

Source code (not compiled bytecode)
Readable variable names

For heavily obfuscated code, consider deobfuscation first.

Will contracts slow down my code?

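Usually minimal. Each check costs about as much as evaluating its condition, which is typically microseconds per call for simple comparisons; conditions that scan large collections cost proportionally more. For performance-critical code, you can disable contracts in production while keeping them enabled in tests.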
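One way to make contracts free when disabled is to decide at decoration time whether to wrap the function at all. The sketch below is a dependency-free illustration of that idea, not icontract's implementation: the `CONTRACTS_ENABLED` environment variable is a hypothetical name, and for brevity the condition receives the same arguments as the function. icontract exposes an `enabled` argument on its decorators for the same purpose; check its documentation for the exact defaults.

```python
import functools
import os
from typing import Any, Callable

# Hypothetical switch for this sketch: set CONTRACTS_ENABLED=0 in production.
CONTRACTS_ENABLED = os.environ.get("CONTRACTS_ENABLED", "1") != "0"


def require(condition: Callable[..., bool], description: str,
            enabled: bool = CONTRACTS_ENABLED) -> Callable:
    """Precondition decorator that adds no wrapper at all when disabled."""

    def decorator(func: Callable) -> Callable:
        if not enabled:
            return func  # original function returned unchanged: no per-call overhead

        @functools.wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            if not condition(*args, **kwargs):
                raise AssertionError(description)
            return func(*args, **kwargs)

        return wrapper

    return decorator


def to_cents(amount: float) -> int:
    return int(amount * 100)


# The same function decorated with checks on and off.
checked = require(lambda amount: amount > 0, "amount must be positive", enabled=True)(to_cents)
unchecked = require(lambda amount: amount > 0, "amount must be positive", enabled=False)(to_cents)
```

Because the disabled path returns the original function object, the production call path is identical to undecorated code, while tests that run with checks enabled still exercise every contract.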

Next Steps

Integration Showcases - See real bugs fixed via VS Code, Cursor, GitHub Actions integrations
ROI Calculator - Calculate your time and cost savings
Brownfield Journey - Complete modernization workflow
Examples - Real-world brownfield examples
FAQ - More brownfield-specific questions

Support

Happy modernizing! 🚀