Wednesday, January 27, 2010

Bayesian nets in RDF, and how to update them

I've banged my head on this for a couple of days and feel close to a solution. The graph looks like this.



Each random boolean variable gets a node, the causal relationship between them gets a node, and each variable gets a probability.
The math for updating probabilities is a little tricky, but in a fun and interesting way, so I enjoyed banging on that. At some point I'll tackle more involved cases where there aren't simply two random boolean variables, but that's the logistically simple case that exposes most of the concepts involved. Kinda like the Drosophila of Bayesian inference.

"""Let A and B be random boolean variables, with A being an
unobservable cause and B being an observable effect, and Pr(B|A) and
Pr(B|~A) are given. This might be the case if the situation has some
structure that dictates the conditional probabilities, like a 6.432
problem, or it just might be empirical.

From these we can compute probabilities for the four cases.
P11 = Pr(B^A) = Pr(A) Pr(B|A)
P10 = Pr(~B^A) = Pr(A) (1-Pr(B|A))
P01 = Pr(B^~A) = (1-Pr(A)) Pr(B|~A)
P00 = Pr(~B^~A) = (1-Pr(A)) (1-Pr(B|~A))

Treat (Pr(B|A), Pr(B|~A)) as one piece of information, and Pr(A) as a
separate independent piece of information. The first piece reflects
your beliefs about the machinery connecting A to B, and the second
reflects your beliefs about the likelihood of A, so these are
legitimately separate concerns.

If we observe that B=1, then we want to replace Pr(A) with our
previous estimate for Pr(A|B), which is given by our old numbers as
P11/Pr(A) = P11/(P11+P01), and this becomes our posterior probability
for A."""

class Link1:
    def __init__(self, PrBgivenA, PrBgivenNotA):
        self.PrBgivenA, self.PrBgivenNotA = PrBgivenA, PrBgivenNotA
    def updatePrA(self, PrA, bvalue):
        p11 = PrA * self.PrBgivenA
        p10 = PrA * (1. - self.PrBgivenA)
        p01 = (1. - PrA) * self.PrBgivenNotA
        p00 = (1. - PrA) * (1. - self.PrBgivenNotA)
        # print p11, p01, p10, p00
        assert 0.999 < p11 + p01 + p10 + p00 < 1.001
        assert PrA - 0.001 < p10 + p11 < PrA + 0.001
        if bvalue:
            # Pr(A|B)
            newPrA = p11 / (p11 + p01)
        else:
            # Pr(A|~B)
            newPrA = p10 / (p10 + p00)
        assert 0.0 <= newPrA <= 1.0
        return newPrA

link = Link1(0.6, 0.27)

PrA = 0.5
print PrA

for i in range(5):
    PrA = link.updatePrA(PrA, 0)
    print PrA

for i in range(10):
    PrA = link.updatePrA(PrA, 1)
    print PrA