(SF15) Questions of reasoning under logical uncertainty
Posted: 2016-08-19 , Modified: 2016-08-19
Tags: ai safety
Knowing how the system behaves but not the result, for lack of computational resources.
Standard probability theory assumes logical omniscience.
What are logically impossible possibilities? (Things that are logically incorrect, but kept as possibilities because you don’t have the computational power to see that they’re incorrect.) What is a reasonable prior over them?
Consider truth values of sentences of logic.
Must preserve some structure: Ex. if probability 1 for \(\phi,\phi\implies \psi\), then probability 1 for \(\psi\).
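Spelling out the step: coherence in fact forces a quantitative version of modus ponens, since \(\phi\wedge(\phi\implies\psi)\) entails \(\psi\):
\[\Pj(\psi)\ge \Pj(\phi\wedge(\phi\implies\psi)) = \Pj(\phi)+\Pj(\phi\implies\psi)-\Pj(\phi\vee(\phi\implies\psi)) \ge \Pj(\phi)+\Pj(\phi\implies\psi)-1,\]
so probability 1 for both premises forces \(\Pj(\psi)=1\).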
2 types of logical uncertainty
Complete logical theory: for every sentence \(\phi\), either \(\phi\in T\) or \(\neg \phi\in T\).
Incomplete theories can be completed. (Use Zorn’s lemma.)
(In what sense is there a “true arithmetic”? In the sense that you believe there is a true answer of whether a Turing machine halts!)1
For a deductively unlimited reasoner, an impossible possibility is any complete theory of logic.
Deductively limited reasoners entertain inconsistent impossible possibilities.
An “impossible possibility,” then, could be any assignment of truth values to logical sentences which does not allow a short proof of inconsistency.
No precise statements about the performance of this proposal have yet been proven.
How to construct a weak (e.g. maximum-entropy) prior over complete theories? (It’s easy to accidentally place probability 0 on complete theories.)
Hutter prior: For each sentence, select a model in which that sentence is true, and in which certain desirable properties hold (the “Gaifman condition” and the “Cournot condition” (Hutter et al. 2013)). Add the complete theory of that model to the distribution with measure in proportion to the probability of the sentence.
Demski’s prior over complete theories extending \(B\).
Resulting prior probability is uncomputable but approximable.
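A toy sketch of the sampling procedure behind such a prior. This is a propositional stand-in, not Demski’s actual construction: `sample_sentence` and `consistent` are hypothetical placeholders, and the real first-order consistency check is undecidable, which is why the prior is only approximable.

```python
import random

# Toy propositional stand-in for Demski-style sampling. The real construction
# samples first-order sentences from a simplicity prior and keeps those
# consistent (undecidable!) with the base theory B plus what was kept so far.
ATOMS = ["p", "q", "r"]

def sample_sentence(rng):
    # Stands in for a simplicity-weighted draw over sentences;
    # each "sentence" here is just a literal (atom, truth value).
    return (rng.choice(ATOMS), rng.random() < 0.5)

def consistent(theory):
    # A set of literals is consistent iff no atom is forced both ways;
    # this replaces the undecidable first-order consistency check.
    forced = {}
    for atom, val in theory:
        if forced.setdefault(atom, val) != val:
            return False
    return True

def sample_complete_theory(rng, base_theory=(), n_steps=100):
    """Draw sentences; keep each one that is consistent with the theory
    so far. The prior probability of a complete theory is the probability
    that this process generates it."""
    theory = list(base_theory)
    for _ in range(n_steps):
        s = sample_sentence(rng)
        if consistent(theory + [s]):
            theory.append(s)
    return sorted(set(theory))

print(sample_complete_theory(random.Random(0)))
```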
Christiano uses standard ML techniques.
Undesirable: if \(B=\emptyset\), this places probability 0 on complete theories in which PA holds (indeed, on any theory that is not finitely axiomatizable).2
Two ways to update Demski’s prior:
Why these are different: Consider \(\psi,\neg \psi\) consistent with \(\phi\).
Intuition
Desiderata
Note that approximations must be incoherent. Reflectivity is possible up to infinitesimal error, but difficult.
Inductivity is the Gaifman condition: if \(\Pj\) is a logical prior and \(\phi\) is a true \(\Pi_1\) sentence \(\forall n:\psi(n)\), then \(\Pj(\phi \mid \psi(1),\ldots,\psi(N))\to 1\) as \(N\to\iy\). But such a prior must assign probability 0 to some true \(\Pi_2\) sentences. These priors are not weak enough!
\(\Pi_1\): \(\forall n:\psi(n)\) with \(\psi\) decidable by a primitive recursive function. \(\Pi_2\): of the form \(\forall\exists\).
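Two standard examples for concreteness (not from the talk):
\[
\underbrace{\forall n:\neg\operatorname{Prf}_{\mathrm{PA}}(n,\ulcorner 0=1\urcorner)}_{\Pi_1:\ \operatorname{Con}(\mathrm{PA})}
\qquad
\underbrace{\forall n\,\exists p>n: p\text{ and }p+2\text{ are prime}}_{\Pi_2:\ \text{twin prime conjecture}}
\]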
(Why do we care about statements that don’t follow from PA? Toy problem for reasoning about uncertainty… is this the right problem?)
Desirable property:
Logical sentences are not the right tool for reasoning about the behavior of objects in the real world.
Realistic reasoning shortcuts. (cf. regret?)
It is not clear that sentences of first order logic are the “correct” underlying structure for logically uncertain reasoning about the world.
Ex. billiards player (?)
Domain | Agent | Minimal sufficient conditions | Desirability arguments | Feasibility
---|---|---|---|---
Rational choice | VNM utility maximizer | VNM axioms | Dutch book arguments, compelling axioms | AIXI, POMDP solvers
Probability | Bayesian updater | Axioms of probability theory | Dutch book arguments, compelling axioms | Solomonoff induction
Logical uncertainty | Garrabrant inductor | ??? | Dutch book arguments, historical desiderata | LIA (2016)
Replace logical omniscience with logical uncertainty.
Logical uncertainty was a roadblock to decision theory, etc.
(What would you do tomorrow if you found out Critch had changed \(\pi\) to 4? Would you still go to work? Would pigs fly?)
Suppose you’re an algorithm: what if I do \(x\)?
That is, the algorithm asks: what if I did this thing that I (in fact) didn’t do?
Put criteria for reasoning in a good language.
A good reasoning process \(P\) should satisfy
There’s a secret property (master criterion) that implies all 4 of these.
Conservatism (not too extreme)
Self-reflection
Tension with Gödel: a system can’t prove its own soundness/consistency, yet it can notice certain things about its own sanity.
A sequence of statements \((\phi_n)\) is polytime generable if there is a machine \(M\) with \(M(n)=\phi_n\) running in polynomial time.
Such sequences of T/F questions are relatively easy to generate, but can be arbitrarily difficult to answer deductively as \(n\) grows: p.g. statements are easy to state, hard to verify.
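A minimal sketch of such a generator, using the halting-time example from later in these notes (the machine \(T\) is a placeholder):

```python
def phi(n: int) -> str:
    """M(n) = φ_n: producing the sentence takes time polynomial in the
    length of the input, but deciding it may require simulating T for n
    steps, or an arbitrarily long proof search."""
    return f"Turing machine T has not halted within {n} steps"

# Easy to state, hard to verify: printing φ_n for n = 10**100 is instant,
# but no feasible deduction settles it.
print(phi(10**100))
```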
Ex. \(G(\ol n)\iff\) There is no proof of \(\ol G(\ol n)\) in \(\le \ol f (\ol n)\) characters.
True, but can’t be proven in fewer characters.
Analogy: Ramanujan vs. Hardy.
Timely manner: \[\begin{align} x_n \simeq_n y_n &\iff \lim_{n\to \iy} (x_n-y_n)=0\\ x_n \gtrsim_n y_n &\iff \liminf_{n\to\iy} (x_n-y_n)>0\\ &\;\,\vdots \end{align}\](How sure of being sure…)
Self-trust
Learning statistical patterns.
Learning provable relationships:
There is no study of convergence rates yet; the point is that it verifies that the desiderata (the first 9) are not contradictory.
Open questions: What probability does it assign to the continuum hypothesis? Are its probabilities bounded in terms of Kolmogorov complexity? (This depends on how it is parametrized.) There are different versions of Garrabrant inductors: do their probabilities dominate each other? All universal semimeasures dominate each other, and Garrabrant inductors dominate universal semimeasures; are all the different good constructions comparable?
(Ex. \(\phi_n\): Turing machine has not halted by time \(n\).)
Try not to lose a lot of money on bets (to agents simpler than you).
A tad under doubly exponential time.
Mentally simulate all polytime traders: run the Wall Street traders in your head; they output algebraic expressions (trades as functions of the market prices). The algorithms respond to the beliefs; the prices are the beliefs, and each trade says what the trader wishes they were instead. Find a fixed point, then computably approximate it.
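A toy version of the fixed-point step for a single sentence. The traders here are hypothetical hand-written demand functions; the real construction handles all sentences simultaneously and can only be computably approximated.

```python
def find_market_price(traders, tol=1e-9):
    """Bisect for a price in [0, 1] at which the traders' aggregate demand
    for the sentence crosses zero: the 'beliefs' the traders respond to are
    exactly the prices they can no longer profitably move."""
    def aggregate(p):
        return sum(t(p) for t in traders)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if aggregate(mid) > 0:   # excess demand: price too low
            lo = mid
        else:                    # excess supply: price too high
            hi = mid
    return (lo + hi) / 2

# Hypothetical traders, each mapping a proposed price to a demand:
bull = lambda p: 0.8 - p   # buys while the price is below 0.8
bear = lambda p: 0.3 - p   # sells once the price exceeds 0.3
print(find_market_price([bull, bear]))  # settles near 0.55
```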
Property: can’t lose too much money to polytime traders.
Pascal’s mugging: “Where are your \(\iy\) dollars? Bring the briefcase with the cash!”
Each trader starts with a finite budget, inversely proportional to its complexity. (By being more complicated, a trader can be smarter.) Smart traders gain more money, place larger bets, and affect market prices more. The smart ones get rich and fill the market.
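A sketch of these budget dynamics with made-up numbers (positive shares mean buying the sentence, negative mean shorting it):

```python
def settle(wealth, shares, price, truth):
    """Trader i bought shares[i] of the sentence at `price`; when the
    sentence resolves to truth ∈ {0, 1}, each share pays `truth`, so the
    profit per share is (truth - price)."""
    return [w + s * (truth - price) for w, s in zip(wealth, shares)]

# Budgets inversely proportional to complexity: simple traders start rich.
complexities = [1, 2, 4]
wealth = [1.0 / c for c in complexities]

# The most complex (smartest) trader shorts a sentence priced at 0.7
# that turns out to be false; it profits and will sway prices more:
wealth = settle(wealth, shares=[0.2, 0.1, -0.25], price=0.7, truth=0)
print(wealth)  # [0.86, 0.43, 0.425]: the smart trader grew its stake 70%
```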
(Maybe that’s what neurons are! jk. Cf. Society of Mind: competing processes.)
Try to define it like this: if there is no proof of “this Turing machine halts,” then say it is false. (This is a tie-breaker.) Now derive all logical consequences. But take some unknown statement \(X\), and consider machines searching for proofs of \(X\) and of \(\neg X\): neither will halt, so this says nothing about \(X\)! So why is there a true arithmetic?↩
Induction schema consists of infinitely many axioms. What if we tried to do this with higher-order logic? What difficulties arise?↩