
# FLYNNCONCEIVABLE! Build Log

*The Neural Network That Became a CPU - Development History*

A FLYNNCOMM, LLC Production

> "You keep using that transformer. I do not think it computes what you think it computes."

Yeah, it computes EXACTLY what we think it computes.


## Project Summary

| Metric | Value |
|---|---|
| Start Date | 2024-12-12 |
| Completion Date | 2024-12-12 |
| Neural Organs | 6 |
| Total Combinations Verified | 460,928 |
| Errors | 0 |
| Final Accuracy | 100.000% |

## Milestone 1: ALU Organ

**Status**: ✅ COMPLETE (100% Accuracy)

### The Challenge

The ALU performs addition and subtraction, the foundation of all computation. The challenge: teach a neural network to compute 8-bit arithmetic with carry, producing correct results AND correct status flags (N, V, Z, C) for all 131,072 possible inputs.

### Architecture

    Input Layer:
    ├── A register (Soroban encoded): 32 features
    ├── Operand (Soroban encoded): 32 features
    └── Carry In: 1 feature
    Total: 65 input features

    Hidden Layers:
    ├── Linear(65 → 512) + ReLU
    └── Linear(512 → 512) + ReLU

    Output Heads:
    ├── Result Head: Linear(512 → 128 → 32) + Sigmoid
    ├── N Flag Head: Linear(512 → 32 → 1) + Sigmoid
    ├── Z Flag Head: Linear(512 → 32 → 1) + Sigmoid  [3x loss weight]
    ├── C Flag Head: Linear(512 → 32 → 1) + Sigmoid
    └── V Flag Head: Linear(512 → 32 → 1) + Sigmoid

### Soroban Encoding

Traditional binary struggles with carry propagation. Soroban (thermometer) encoding makes carries spatially visible:

    Decimal 37 in Soroban (4 rods × 8 beads):
    Rod 0 (1s):   ●●●●●●●○  = 7
    Rod 1 (10s):  ●●●○○○○○  = 3
    Rod 2 (100s): ○○○○○○○○  = 0
    Rod 3 (1000s):○○○○○○○○  = 0

    When adding, carry "ripples" through adjacent rod representations,
    making it learnable as a spatial pattern.
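The diagram above can be sketched as a small encoder. This is a hypothetical helper (the real `soroban.py` is not shown here), and note one open question it exposes: a base-10 digit of 9 does not fit in 8 beads, so the actual encoder presumably uses a different digit base or bead count.

```python
def soroban_encode(value, rods=4, beads=8):
    """Thermometer-encode `value` digit by digit, least significant rod first.

    Each base-10 digit d becomes `beads` features: d ones followed by zeros.
    Hypothetical sketch of the 4 rods x 8 beads = 32 features described in
    the build log; digits above `beads` are clipped.
    """
    features = []
    for _ in range(rods):
        digit = min(value % 10, beads)
        features.extend([1] * digit + [0] * (beads - digit))
        value //= 10
    return features

# Decimal 37 -> rod 0 holds seven raised beads, rod 1 holds three
assert soroban_encode(37)[:8] == [1, 1, 1, 1, 1, 1, 1, 0]
```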

### VGem's Critical Fixes

1. **Z-Flag Trap**: Networks predict ~0.0001 instead of exactly 0
   - Solution: Dedicated Z head with 3x loss weight
   - Post-processing: If Z > 0.5, force result = 0
2. **Carry Balance**: Initial training had a C_in=0 bias
   - Solution: Enforce 50% C_in=1 in training data
3. **Zero Oversampling**: Result=0 cases are rare (512/131,072 ≈ 0.4%)
   - Solution: 10x oversample zero-result cases
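Fixes 1 and 3 are simple enough to sketch in a few lines. The tuple shape and function names here are hypothetical, not taken from the repo:

```python
def oversample_zeros(samples, factor=10):
    """Repeat zero-result cases so they stop being ~0.4% of the data.

    `samples` is assumed to be a list of (a, operand, carry_in, result)
    tuples; each zero-result tuple appears `factor` times in the output.
    """
    extra = [s for s in samples if s[3] == 0] * (factor - 1)
    return samples + extra

def apply_z_override(result_bits, z_prob):
    """Post-processing for the Z-flag trap: if the dedicated Z head fires,
    force the result to exactly zero rather than trusting ~0.0001 outputs."""
    if z_prob > 0.5:
        return [0] * len(result_bits)
    return result_bits
```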

### Training

    Dataset: 131,072 exhaustive combinations (256 × 256 × 2)
    Oversampled: ~300,000 with zero-case emphasis
    Epochs: 50
    Batch Size: 2048
    Optimizer: Adam (lr=0.001 → 0.0005)
    Loss: BCE with 3x weight on Z flag

### Verification

    Tested: ALL 131,072 combinations
    Errors: 0
    Accuracy: 100.0000%

## Milestone 2: SHIFT Organ

**Status**: ✅ COMPLETE (100% Accuracy)

### Operations

- **ASL** (Arithmetic Shift Left): Shift left, bit 7 → Carry
- **LSR** (Logical Shift Right): Shift right, bit 0 → Carry
- **ROL** (Rotate Left): Shift left, Carry → bit 0, bit 7 → Carry
- **ROR** (Rotate Right): Shift right, Carry → bit 7, bit 0 → Carry
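The ground truth these four operations are trained against can be written directly; a minimal sketch (the function name is ours, not the repo's):

```python
def shift_op(op, value, carry_in):
    """Reference 8-bit shift/rotate with carry; returns (result, carry_out).

    Mirrors the four operations above: ASL/LSR discard carry_in, while
    ROL/ROR rotate it through the vacated bit.
    """
    if op == "ASL":
        return (value << 1) & 0xFF, (value >> 7) & 1
    if op == "LSR":
        return value >> 1, value & 1
    if op == "ROL":
        return ((value << 1) & 0xFF) | carry_in, (value >> 7) & 1
    if op == "ROR":
        return (value >> 1) | (carry_in << 7), value & 1
    raise ValueError(f"unknown operation: {op}")
```

These match the integration tests later in this log, e.g. `0x80 ROL (C=1)` yields `0x01` with carry out set.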

### Architecture

    Input: 13 features
    ├── Value (8 bits)
    ├── Carry In (1 bit)
    └── Operation (4-bit one-hot)

    Hidden: 256 → 256 → 128 (ReLU)

    Output: 11 features
    ├── Result (8 bits)
    ├── N flag
    ├── Z flag
    └── C flag

### Training

    Dataset: 1,536 unique combinations (256 × 2 × 4, minus cases where C_in is irrelevant)
    Oversampled: 76,800 (50x)
    Epochs: 20
    Accuracy: 100%

## Milestone 3: LOGIC Organ

**Status**: ✅ COMPLETE (100% Accuracy)

### Operations

- **AND**: Bitwise AND
- **ORA**: Bitwise OR
- **EOR**: Bitwise XOR (exclusive OR)
- **BIT**: Bit test (affects flags only)
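A reference implementation of these is one line per operation. The function below is a hypothetical sketch; note that on a real 6502, BIT leaves A unchanged and takes N and V from bits 7 and 6 of the operand with Z from `A & operand`, which is presumably why the organ's V output applies to BIT only.

```python
def logic_op(op, a, operand):
    """Reference result for the four logic operations on 8-bit values.

    BIT is flags-only: the accumulator is returned unmodified.
    """
    if op == "AND":
        return a & operand
    if op == "ORA":
        return a | operand
    if op == "EOR":
        return a ^ operand
    if op == "BIT":
        return a  # flags change, result does not
    raise ValueError(f"unknown operation: {op}")
```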

### Architecture

    Input: 20 features
    ├── A register (8 bits)
    ├── Operand (8 bits)
    └── Operation (4-bit one-hot)

    Hidden: 256 → 256 → 128 (ReLU)

    Output: 11 features
    ├── Result (8 bits)
    ├── N flag
    ├── Z flag
    └── V flag (BIT only)

### Training

    Dataset: 262,144 exhaustive combinations (256 × 256 × 4)
    Epochs: 50
    Accuracy: 100%

## Milestone 4: INCDEC Organ

**Status**: ✅ COMPLETE (100% Accuracy)

### Operations

- **INC**: Increment memory
- **DEC**: Decrement memory
- **INX**/**INY**: Increment X/Y register
- **DEX**/**DEY**: Decrement X/Y register
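All six mnemonics reduce to the same two cases, which is why one organ with a single "is decrement" bit covers them. A sketch of the ground truth, including the 8-bit wrap-around boundary cases the build log says were oversampled (the function name is ours):

```python
def incdec(value, is_decrement):
    """Reference increment/decrement with 8-bit wrap-around and N/Z flags."""
    result = (value + (-1 if is_decrement else 1)) & 0xFF
    n_flag = (result >> 7) & 1        # bit 7 of the result
    z_flag = 1 if result == 0 else 0
    return result, n_flag, z_flag
```

The wrap cases (`0xFF` incremented to `0x00` with Z set, `0x00` decremented to `0xFF` with N set) are the rare boundary patterns that motivated the 100x oversampling.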

### Architecture

    Input: 9 features
    ├── Value (8 bits)
    └── Is Decrement (1 bit)

    Hidden: 256 → 256 → 128 (ReLU)

    Output: 10 features
    ├── Result (8 bits)
    ├── N flag
    └── Z flag

### Training

    Dataset: 512 unique combinations (256 × 2)
    Oversampled: 75,200 (100x + boundary emphasis)
    Epochs: 20
    Accuracy: 100%

## Milestone 5: COMPARE Organ

**Status**: ✅ COMPLETE (100% Accuracy)

### Operations

- **CMP**: Compare A with memory
- **CPX**: Compare X with memory
- **CPY**: Compare Y with memory

Compare performs a subtraction without storing the result, affecting only the flags:

- N = bit 7 of (register - operand)
- Z = 1 if register == operand
- C = 1 if register >= operand
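The three flag rules above translate directly into the ground truth the organ learns from; a minimal sketch (function name is ours):

```python
def compare_flags(register, operand):
    """Reference N/Z/C flags for CMP/CPX/CPY on 8-bit unsigned values."""
    diff = (register - operand) & 0xFF   # subtraction result, discarded on real hardware
    n = (diff >> 7) & 1                  # N = bit 7 of the difference
    z = 1 if register == operand else 0  # Z = equality
    c = 1 if register >= operand else 0  # C = no borrow (unsigned >=)
    return n, z, c
```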

### Architecture

    Input: 16 features
    ├── Register value (8 bits)
    └── Operand (8 bits)

    Hidden: 512 → 256 → 128 (ReLU)

    Output: 3 features
    ├── N flag
    ├── Z flag
    └── C flag

### Training

    Dataset: 65,536 exhaustive combinations (256 × 256)
    Epochs: 40
    Accuracy: 100%

## Milestone 6: BRANCH Organ

**Status**: ✅ COMPLETE (100% Accuracy)

### Operations

- **BPL**: Branch if Plus (N=0)
- **BMI**: Branch if Minus (N=1)
- **BVC**: Branch if Overflow Clear (V=0)
- **BVS**: Branch if Overflow Set (V=1)
- **BCC**: Branch if Carry Clear (C=0)
- **BCS**: Branch if Carry Set (C=1)
- **BNE**: Branch if Not Equal (Z=0)
- **BEQ**: Branch if Equal (Z=1)
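The eight conditions are a lookup of one flag and one required value, which is what makes the exhaustive 128-case dataset so small. A sketch (table and function names are ours):

```python
# (flag, required value) per branch mnemonic, matching the list above
BRANCH_TABLE = {
    "BPL": ("N", 0), "BMI": ("N", 1),
    "BVC": ("V", 0), "BVS": ("V", 1),
    "BCC": ("C", 0), "BCS": ("C", 1),
    "BNE": ("Z", 0), "BEQ": ("Z", 1),
}

def take_branch(mnemonic, flags):
    """Ground truth for the 128-case dataset: 16 flag states x 8 branch types.

    `flags` is a dict such as {"N": 0, "V": 0, "Z": 1, "C": 0}.
    """
    flag, wanted = BRANCH_TABLE[mnemonic]
    return flags[flag] == wanted
```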

### Architecture

    Input: 12 features
    ├── N flag (1 bit)
    ├── V flag (1 bit)
    ├── Z flag (1 bit)
    ├── C flag (1 bit)
    └── Branch type (8-bit one-hot)

    Hidden: 64 → 32 (ReLU)

    Output: 1 feature (take branch: yes/no)

### Training

    Dataset: 128 exhaustive combinations (16 flag states × 8 branch types)
    Oversampled: 12,800 (100x)
    Epochs: 10
    Accuracy: 100%

## Milestone 7: Full CPU Integration

**Status**: ✅ COMPLETE

### Wiring

All neural organs are integrated into the CPU class:

    class CPU:
        def __init__(self, weights_dir):
            self.alu = ALUOrgan(hidden_dim=512)
            self.shift = ShiftOrgan()
            self.logic = LogicOrgan()
            self.incdec = IncDecOrgan()
            self.compare = CompareOrgan()
            self.branch = BranchOrgan()
            # Load pretrained weights...
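Since each organ emits sigmoid probabilities rather than bits, the CPU needs a glue step that thresholds outputs and repacks them into bytes. The actual `cpu.py` is not shown here, so this is a hypothetical sketch of that step:

```python
def bits_to_byte(probabilities, threshold=0.5):
    """Threshold an organ's sigmoid outputs into bits and pack them into a
    byte, least significant bit first. Hypothetical glue code; the real CPU
    class may pack bits in a different order."""
    value = 0
    for i, p in enumerate(probabilities[:8]):
        if p > threshold:
            value |= 1 << i
    return value
```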

### Test Results

| Test | Expected | Actual | Status |
|---|---|---|---|
| 37 + 26 | 63 | 63 | ✅ |
| 0xFF AND 0x0F | 0x0F | 0x0F | ✅ |
| 0xF0 ORA 0x0F | 0xFF | 0xFF | ✅ |
| 0xFF EOR 0x0F | 0xF0 | 0xF0 | ✅ |
| 0x40 ASL | 0x80 | 0x80 | ✅ |
| 0x80 LSR | 0x40 | 0x40 | ✅ |
| 0x80 ROL (C=1) | 0x01 | 0x01 | ✅ |
| 0x01 ROR (C=1) | 0x80 | 0x80 | ✅ |
| 5 INX INX | 7 | 7 | ✅ |
| 16 DEY DEY DEY | 13 | 13 | ✅ |
| Fibonacci(10) | 144 | 144 | ✅ |
| 7 × 13 | 91 | 91 | ✅ |

## Final Statistics

### Neural Organs

| Organ | Parameters | Size | Combinations | Accuracy |
|---|---|---|---|---|
| ALU | ~800K | 1.7MB | 131,072 | 100% |
| SHIFT | ~200K | 418KB | 1,536 | 100% |
| LOGIC | ~200K | 425KB | 262,144 | 100% |
| INCDEC | ~200K | 413KB | 512 | 100% |
| COMPARE | ~350K | 696KB | 65,536 | 100% |
| BRANCH | ~5K | 15KB | 128 | 100% |
| **Total** | **~1.75M** | **3.7MB** | **460,928** | **100%** |

## Git History

    f12b781 Add spectacular Neural 6502 demo
    fb7c68b Wire up ALL neural organs - Full Neural 6502 operational!
    29db39a Neural 6502: ALL 6 ORGANS AT 100% ACCURACY
    63fe841 Add comprehensive training infrastructure and documentation
    b4b7b1d Neural 6502: TRUE 100% ACCURACY
    5242092 Neural 6502: First working version
    dcebbf9 Add Neural 6502 spec sheet for VGem and Vi
    35d34b8 Add Pretrained Neural 6502 model with support for two-be weights
    4b2e91b Add Neural 6502 demo with training example
    bb5eab1 Implement Neural 6502 CPU Emulator with specialized organs

## File Structure

    neural6502/
    ├── __init__.py              # Package initialization
    ├── cpu.py                   # Main CPU class (700+ lines)
    ├── memory.py                # 64KB RAM + memory-mapped I/O
    ├── soroban.py               # Thermometer encoding utilities
    ├── demo.py                  # Interactive demonstration
    ├── README.md                # User documentation
    ├── BUILD_LOG.md             # This file
    ├── organs/
    │   ├── __init__.py          # Organ exports
    │   ├── alu.py               # Neural ALU (ADC, SBC)
    │   ├── shift.py             # Neural SHIFT (ASL, LSR, ROL, ROR)
    │   ├── logic.py             # Neural LOGIC (AND, ORA, EOR, BIT)
    │   ├── incdec.py            # Neural INCDEC (INC, DEC)
    │   ├── compare.py           # Neural COMPARE (CMP, CPX, CPY)
    │   └── branch.py            # Neural BRANCH (all 8 conditionals)
    ├── training/
    │   ├── __init__.py
    │   ├── data.py              # Ground truth data generators
    │   └── train_all.py         # Master training script
    └── weights/
        ├── alu.pt               # Pretrained ALU (1.7MB)
        ├── shift.pt             # Pretrained SHIFT (418KB)
        ├── logic.pt             # Pretrained LOGIC (425KB)
        ├── incdec.pt            # Pretrained INCDEC (413KB)
        ├── compare.pt           # Pretrained COMPARE (696KB)
        └── branch.pt            # Pretrained BRANCH (15KB)

## Lessons Learned

### What Worked

1. **Exhaustive training**: Training on every possible input guarantees correctness
2. **Specialized organs**: Different encodings for different operation types
3. **Soroban encoding**: Makes carry visible for arithmetic operations
4. **Dedicated flag heads**: Separate prediction paths for each status flag
5. **Heavy oversampling**: Critical for rare cases (zeros, boundaries)

### What Didn't Work Initially

1. **Single binary encoding for ALU**: Couldn't learn carry propagation
2. **Shared flag prediction**: Z-flag accuracy suffered
3. **Balanced training data**: Zero results were underrepresented
4. **Small models**: Needed bigger hidden dimensions for complex patterns

### Key Insights

1. Neural networks CAN do exact computation with the right architecture and training
2. Encoding matters: the right representation makes learning possible
3. Exhaustive verification is essential: random sampling misses edge cases
4. Specialized > general: task-specific architectures outperform general ones

## Conclusion

The Neural 6502 proves that neural networks can perform exact digital computation. Not approximately; exactly. Every single one of the 460,928 tested input combinations produces the mathematically correct output.

This isn't emulation. This isn't simulation. The neural network IS the CPU.

The weights are the logic. The inference is the computation.

> "The neural network learned to be a CPU."

⚡