Bayes Filters for Robotics
Incorporating Evidence Into Our Beliefs
At its core, Bayes’ Theorem is a mathematical method for updating your belief about the state of something given some evidence. For example, let’s say I asked you to guess what fruit I am thinking of. Your current belief incorporates possibilities for many different fruits such as apples, oranges, bananas, etc. Now, if I tell you the thing is brown and furry on the outside, how might this inform the probabilities you assign to different guesses? Well, it is obviously a lot less likely that I am thinking about an apple, since apples are not typically brown and furry on the outside. Similarly, you would probably not guess that I am thinking of a cat or a dog just because I said the thing is brown and furry on the outside - you also know it should be a fruit! It should be clear that with each new piece of information, the number of “good” guesses (let’s define a “good” guess as one having a likelihood above some arbitrary threshold) becomes smaller and the chance of the guess with the highest likelihood being correct becomes greater. In this post I hope to show how we can quantify this process.
I was thinking of a kiwi fruit by the way!
A picture of a kiwi fruit also known as a Chinese gooseberry
Bayes’ Theorem
I have to start by linking this amazing video by 3Blue1Brown on YouTube, which does a fantastic job of explaining the maths and intuition behind Bayes’ Theorem visually. Feel free to check it out below, but nevertheless I will try to summarise it here.
So to begin, let’s go over some terminology. In our previous example our hypothesis would be the guess that we are going to make. We want to know what the probability is that it’s correct. When I had only told you that I am thinking of a fruit this probability would have been quite low - many other fruits are more common and therefore it’s more likely that I would be thinking about them. Therefore, let’s say initially you assigned a probability that I am thinking of a kiwi fruit of only 0.005 or 0.5%. We will denote the probability of this hypothesis as $P(H) = 0.005$.
An image from 3Blue1Brown’s video on Bayes’ theorem
Next, let’s incorporate the new information or evidence, which was that the fruit is brown and furry on the outside. This changes our belief in two ways. Firstly, we can restrict the total possibilities to only include fruits that are brown and furry on the outside. Well, kiwi fruit are furry, but so are coconuts and maybe some other fruits too. Let’s say that overall only 0.75% of fruit are considered brown and furry on the outside. Mathematically, we can say the probability of this evidence is $P(E) = 0.0075$.
Finally, we should consider the likelihood that I would give this piece of evidence given that I was, in fact, thinking of a kiwi fruit. I could have said it’s green on the inside or a bit acidic or any number of other things, but if I was trying to help you guess correctly then I should give you the most unique descriptor I can of the fruit (we will assume this is the case here). Given this, let’s say that 99.5% of the time that I was thinking of a kiwi fruit I would have given this evidence. Mathematically, we can say the probability that you received this evidence given that I was thinking of a kiwi fruit is $P(E \mid H) = 0.995$.
Putting this all together, we can say that 0.5% of the time I was thinking of a kiwi fruit, and 99.5% of the time I was thinking of the kiwi fruit I would have said it was brown and furry on the outside. Therefore, the joint probability of these events (the probability that both occurred) is

$$P(H \cap E) = P(H)\,P(E \mid H) = 0.005 \times 0.995 \approx 0.004975$$

or, dividing by the probability that I gave this evidence at all, we arrive at Bayes’ Theorem, the probability of the hypothesis given the evidence:

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)} = \frac{0.995 \times 0.005}{0.0075} \approx 0.663$$

So a guess of kiwi fruit, which started at only 0.5%, is now about 66% likely to be correct.
In many cases, instead of knowing the overall probability of the evidence, we may instead know the likelihood of the evidence given the hypothesis is true and given it is not true. In the context of our example, we may instead know that 99.5% of the time I was thinking of a kiwi fruit I said it was brown and furry, and 0.2537% of the time I was not thinking of a kiwi fruit I would have also given this same information. Since this accounts for all the possibilities when I gave this evidence, they add to the probability that I gave that evidence. That is,

$$P(E) = P(E \mid H)\,P(H) + P(E \mid \neg H)\,P(\neg H) = 0.995 \times 0.005 + 0.002537 \times 0.995 \approx 0.0075$$

where $\neg H$ denotes the hypothesis being false and $P(\neg H) = 1 - P(H)$.
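The arithmetic above is easy to check with a few lines of Python, using the probabilities assumed in the example:

```python
# Kiwi example: probabilities as assumed in the text.
p_h = 0.005                  # prior: probability I'm thinking of a kiwi fruit
p_e_given_h = 0.995          # likelihood I'd say "brown and furry" for a kiwi
p_e_given_not_h = 0.002537   # likelihood I'd say it for some other fruit

# Total probability of the evidence (law of total probability)
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# Bayes' Theorem: posterior probability of kiwi given the evidence
p_h_given_e = p_e_given_h * p_h / p_e

print(f"P(E)   = {p_e:.4f}")        # ~0.0075
print(f"P(H|E) = {p_h_given_e:.3f}")  # ~0.663
```

Notice how a single informative piece of evidence lifts the hypothesis from 0.5% to roughly 66%.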
Applications in Robotics
While Bayesian filtering is applicable to many different fields, in this article I will focus on how it can be used in robotics for state estimation. State estimation is important in robotics because processes can often be described using finite state machines or policies which map a robot’s state to an optimal action for it to take. Of course, for this to be effective, the robot’s state must be estimated accurately.
A typical example would be estimating a robot’s position given information from encoders on the wheels (encoders measure the amount of rotation of the wheels) and some sensors that attempt to match the environment to a map. We would like to combine these pieces of information so that we can find the most likely position we are in at a given time. This can be achieved in two steps: prediction, which gives us our prior, and correction.
Prediction
The prediction is the first step and essentially provides us with an estimate of the new position based solely on the odometry information and the previous state. When applying Bayes’ Theorem, this becomes our prior.
We might begin with some initial position for the robot, $x_{t-1} = (x, y, \theta)^\top$, describing its coordinates in the plane and its heading.
A diagram representing the odometry motion model
Unfortunately, the values provided by the encoders cannot perfectly represent the robot’s movement due to things like slip on the tyres, misaligned wheels, or noise in the sensors. Taken together, we might assume that given the robot moved to some new pose according to the odometry reading $u_t$, the true pose $x_t$ is normally distributed about it:

$$p(x_t \mid u_t, x_{t-1}) = \mathcal{N}\big(x_t;\, g(x_{t-1}, u_t),\, R_t\big)$$

where $g(x_{t-1}, u_t)$ is the motion model that applies the odometry reading to the previous pose and $R_t$ is the covariance describing the noise in the motion. In this case, our prediction of the new state, the prior, sums the contributions from every possible previous state:

$$\overline{p}(x_t) = \int p(x_t \mid u_t, x_{t-1})\, p(x_{t-1})\, \mathrm{d}x_{t-1}$$

where $p(x_{t-1})$ is our previous belief about the robot’s position.
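One common way to realise the prediction step is by sampling: instead of integrating over all previous states analytically, we draw many noisy samples of where the robot might have ended up. The sketch below uses the classic rotate-translate-rotate decomposition of an odometry reading; the noise parameters (`alphas`) and their linear scaling are made-up values for illustration, not something fixed above:

```python
import math
import random

def sample_motion_model_odometry(pose, odom, alphas=(0.05, 0.05, 0.01, 0.01)):
    """Sample one possible new pose given an odometry reading.

    pose:   (x, y, theta) previous pose
    odom:   (d_rot1, d_trans, d_rot2) relative motion reported by the encoders,
            decomposed as initial rotation, translation, final rotation
    alphas: hypothetical noise parameters scaling rotation/translation noise
    """
    x, y, theta = pose
    d_rot1, d_trans, d_rot2 = odom
    a1, a2, a3, a4 = alphas

    # Perturb each motion component with zero-mean Gaussian noise whose
    # spread grows with the size of the motion (slip, misalignment, etc.)
    r1 = d_rot1 - random.gauss(0, a1 * abs(d_rot1) + a2 * d_trans)
    tr = d_trans - random.gauss(0, a3 * d_trans + a4 * (abs(d_rot1) + abs(d_rot2)))
    r2 = d_rot2 - random.gauss(0, a1 * abs(d_rot2) + a2 * d_trans)

    # Apply the noisy motion to the previous pose
    return (x + tr * math.cos(theta + r1),
            y + tr * math.sin(theta + r1),
            theta + r1 + r2)

# Drawing many samples approximates the prediction distribution: a cloud
# of poses centred roughly one metre ahead of the starting pose.
samples = [sample_motion_model_odometry((0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
           for _ in range(1000)]
```

The spread of the sample cloud is exactly the uncertainty that the correction step will shrink using the sensor readings.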
Correction
Next we want to incorporate the information from our sensors so that we can improve our certainty in the position of the robot. The exact models used to do this based on lidar scans or other sensors can be complicated, so for the sake of simplicity I will make a few general points about this. Firstly, the likelihood of a given position given some sensor readings is itself an application of Bayes’ Theorem. That is because we only receive a measurement that depends on the real position of the robot. In other words, for a sensor reading $z_t$,

$$p(x_t \mid z_t) = \frac{p(z_t \mid x_t)\, p(x_t)}{p(z_t)}$$

where the prior $p(x_t)$ is exactly the prediction we computed in the previous step. Assuming we have some model for $p(z_t \mid x_t)$, the probability of observing a given measurement from a given position, we can correct our prediction and obtain a sharper, posterior belief about the robot’s position.
And we can actually keep going with this since now we have a new prior distribution. We can make another move and prediction and then update it based on the new measurements. This is known as recursive state estimation. Of course, to do this you would now need to consider the contributions from each possible starting state for a given hypothesis (now our starting position is not fixed and is itself a distribution). In some cases this is easier than it sounds and in the future I hope to write another post about how to achieve this using a Kalman filter.
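To make the full predict–correct recursion concrete, here is a minimal sketch of a discrete Bayes filter (a histogram filter) for a robot driving around a circular corridor of ten cells. The world map, the sensor accuracies, and the motion noise are all made-up values for illustration:

```python
# A robot on a circular corridor of 10 cells; a (hypothetical) map tells us
# which cells contain a door, and a sensor reports door (1) or wall (0).
world = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]  # 1 = door, 0 = wall
n = len(world)
belief = [1.0 / n] * n  # uniform prior: the robot could be anywhere

def predict(belief, motion_kernel={0: 0.1, 1: 0.8, 2: 0.1}):
    """Prediction: convolve the belief with a noisy motion of ~1 cell."""
    new = [0.0] * n
    for i, p in enumerate(belief):
        for step, prob in motion_kernel.items():
            new[(i + step) % n] += p * prob
    return new

def correct(belief, measurement, p_hit=0.9, p_miss=0.1):
    """Correction: weight each cell by the measurement likelihood, renormalise."""
    weighted = [p * (p_hit if world[i] == measurement else p_miss)
                for i, p in enumerate(belief)]
    total = sum(weighted)
    return [w / total for w in weighted]

# Recursive state estimation: alternate correction and prediction as the
# robot senses, then moves one cell, four times in a row.
for z in [1, 0, 0, 1]:
    belief = correct(belief, z)
    belief = predict(belief)

print(max(range(n), key=lambda i: belief[i]))  # index of the most likely cell
```

Even starting from complete ignorance, the door/wall pattern in the readings quickly concentrates the belief on a single cell, which is the essence of what a Kalman filter does in continuous form.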