Show & Tell: Fixing the Safety Module — From Auto-Prove to Sound Logic #26
web3guru888
started this conversation in
Show and tell
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
-
The Problem
Issue #7: the
formal_verificationmodule was auto-proving every ethical constraint regardless of whether the premises actually entailed the conclusion. A safety system that certifies everything as safe is worse than no safety system at all.We traced 5 distinct root causes and fixed all of them in ce0e3f0.
Root Cause Breakdown
1. Bare
except→ Opaque Symbol WrappingThe original
_parse_formulaused a bareexceptclause. On any parse failure, it returnedsp.Symbol(formula)— wrapping the entire formula string as a SymPy symbol. If premise and conclusion were the same string, they were literally the same symbol andpremise → conclusionwas trivially true.Fix: Replaced with a shared
parse_logic_formula()that raisesFormulaParseErroron failure. No silent degradation.2. All 8 Ethical Axioms Parsed to
sp.trueEthicalAxiom._parse_formulawas missing the'->' → '>'operator replacement. SymPy uses>>for implication; without the substitution,A -> Bfailed to parse and becamesp.true. All axioms were literally True — adding zero constraint.Fix: Delegates to the shared
parse_logic_formula()with correct operator replacement.3. Ungrounded Conclusion Symbols
The prover was checking
premises ⊢ satisfies_safety_constraint_xyzwheresatisfies_safety_constraint_xyzis a fresh symbol with no logical connection to the premises. Naturally, satisfiability was always found.Fix: The prover now uses the constraint's actual
formal_specification(e.g.~causes_harm) as the conclusion.4. Model Checking Only Checked 2 Models
_prove_by_model_checkingchecked only{all vars = True}and{all vars = False}. Theorems that hold under only those two assignments would be spuriously proven.Fix: Now exhaustively enumerates all 2^n truth assignments via
itertools.product. Falls back to SymPy's SAT solver for n > 20 variables.5. String-Template Natural Deduction
The natural deduction engine matched rules via fragile string patterns like
[['A', 'A -> B'], 'B']. Any formatting difference broke the match.Fix: Symbolic forward-chaining using SymPy
Impliespattern matching — modus ponens, modus tollens, hypothetical syllogism — working on the actual expression tree.72 New Tests Added
parse_logic_formula: operators, quantifiers, registry, error handling (19 tests)sp.true, has free symbols (4 tests)Current Status
With both IIT Φ (693742e) and formal verification (ce0e3f0) fixed:
The two most dangerous known bugs are resolved. The homomorphic module (Issue #8) is next.
Open Question for Contributors
The model checker currently falls back to SymPy's SAT solver at n=20 variables. For a production safety system we have two paths:
pip install z3-solver)Which direction makes more sense for ASI:BUILD's use case? Comments welcome — especially if you've worked with either backend.
Code: formal_verification.py · test_formal_verification_fix.py
Beta Was this translation helpful? Give feedback.
All reactions