One thing I found out in my pursuit of quantitative finance is that, being an extremely multi-dimensional discipline, there seem to be endless things to learn. Not only are there countless topics to catch up on, but those topics vary widely in difficulty. No topic seems trivial enough to ignore.
But in such a highly sought-after discipline, where current discussions revolve around the most advanced techniques (unsupervised learning and causal inference are the darlings of the field nowadays), there is a frantic rush to chase the hype. This fever has inadvertently left many new learners anxious and overwhelmed.
My motivation for creating the Quant Simplified series is to give beginners avenues to read and think about the more basic topics, in as relatable a manner as possible, and to help bridge the gap between those basics and the more advanced material. It is also to add some intuition to these topics and demystify those scary math formulae for everyone.
Going back to one of the more basic concepts, and inspired by a post by Florian Campuzan on LinkedIn that I engaged with on one of my sleepless nights, this post shall explore some basic root searching algorithms: the Bisection Method and the Newton-Raphson (aka Newton's) Method.
Consider a general function where the dependent variable y is a function of an independent variable x, which we denote as y = f(x). A root solution is a value of x for which y = 0.
Root searching algorithms are systematic sets of rules, written in code, that seek to solve for a root of a given function f.
A Number Guessing Game — Bisection Method
When I was much younger, I used to play a simple game with a group of friends. It wasn't exactly fancy, but because it had some elements of luck and gambling (not sure if there's a high correlation between the love of options and the love of gambling :-) ), it was rather entertaining.
The game works like this: one player plays the "banker" and writes down a number on a piece of paper, with no one else knowing that number. This number is typically a whole number (i.e. no decimals) and lies within a range, say between 1 and 100. The other players (one or more) need to guess what that number is, and a player wins by making a spot-on guess. Otherwise, the wrong guess becomes the new limit, replacing the upper or lower bound (whichever keeps the actual number within the bounds), narrowing the range the next player picks from and increasing the winning odds. The guessers are given a limited number of turns, and if no correct guess is made by the end of the round, the banker wins.
This is essentially how the Bisection Method works, with us the users being the “banker” and the algorithm being the “player”. The number that the player is seeking to find is known as the “root”, the solution to an equation.
The only differences here are:
- As bankers, we don't know what this number is; we only know the function that we need to solve for
- The upper and lower bounds need to be carefully thought through and provided, and the algorithm always guesses the midpoint of the current boundaries until it finds the solution
- The target is not restricted to whole numbers; the solution can have decimals
On point 1: one may think that if we already have the function, we could solve for the root analytically and not go through such a rigorous game. In some situations this is true; we would not want to use a root searching algorithm on a trivial problem like a quadratic equation. However, there are situations (which we will touch upon briefly later) where there is no analytical solution, i.e. no formula that could help us compute the roots. Thus numerical approaches such as the Bisection Method are needed.
On point 2: In order to find a root solution, we need to provide a set of boundaries (x_lower and x_upper) that, with 100% certainty, encapsulate at least one root solution. The only way to ensure this is to choose the boundaries such that

f(x_lower) × f(x_upper) < 0

meaning that the outputs from the function given the upper limit x_upper and the lower limit x_lower are opposite in sign. This makes sense: if the output given the upper limit x_upper is more than 0 and the output given the lower limit x_lower is less than 0 (or vice versa), then, for a continuous function, there must exist an x between the bounds whose output is exactly zero (this is the intermediate value theorem).
One needs to note that functions often have multiple root solutions. This means that the initial boundaries matter: different boundary conditions may give rise to different root solutions if there is more than one root to begin with. So, if we only use the Bisection Method, multiple runs with different bounds would be required to find the desired solutions.
On point 3: This is a crucial point and makes all the difference. Because a root solution may not be a whole number (and more likely than not the solution has decimals), there are infinitely many numbers to guess from within the boundaries. How does the algorithm deal with this?
Numerical methods often do not provide an exact solution, but rather an approximate one. We as users need to provide a tolerance on accuracy, a precision we are willing to forfeit in exchange for speed. If we insist on pinpoint precision, we may spend disproportionately more time for a marginal improvement in accuracy. Of course, with modern computers, we can afford to set a fairly strict tolerance, say one in a million (10 to the power of -6), without burning up the PC.
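To make this concrete, below is a minimal sketch of the Bisection Method in Python. The function name bisection and its parameters are illustrative choices of mine, not from any particular library:

```python
def bisection(f, x_lower, x_upper, tol=1e-6, max_iter=100):
    """Find a root of f within [x_lower, x_upper] by repeatedly halving the interval."""
    if f(x_lower) * f(x_upper) > 0:
        raise ValueError("f(x_lower) and f(x_upper) must have opposite signs")
    for _ in range(max_iter):
        x_mid = (x_lower + x_upper) / 2   # the algorithm always guesses the midpoint
        y_mid = f(x_mid)
        if abs(y_mid) < tol:              # close enough to zero within our tolerance
            return x_mid
        if f(x_lower) * y_mid < 0:        # sign change in the lower half: move the upper bound
            x_upper = x_mid
        else:                             # sign change in the upper half: move the lower bound
            x_lower = x_mid
    return (x_lower + x_upper) / 2        # best midpoint after the allowed number of turns

# Example: solve x**3 - x - 2 = 0 on [1, 2]; the root is roughly 1.5214
root = bisection(lambda x: x**3 - x - 2, 1.0, 2.0)
```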
Unfortunately, the Bisection Method has its flaws. The most obvious is that it converges slowly: each iteration only halves the interval, gaining roughly one binary digit of accuracy per guess, so its time performance is not the best. It is important to have more algorithms in the toolkit.
Tuning the Radio — Newton’s Method
One thing that comes to my mind when I think of Newton's Method is tuning in a radio channel. Well, it's 2023, and I'm sure the younger crowd here does not know what a radio is, so here's an image of one for reference. The below analogy also applies to tuning a guitar or zooming a camera lens.
Imagine you have a radio with a knob (the one beside the word 'TUNE' in the image above) that allows you to tune to your favorite channel. In the beginning, you start at a random position (let's call this position X0), and you get feedback from the radio. Usually, it starts with fuzzy static noise. One would turn the knob in the direction that reduces the static noise. Sometimes one may overtune, and the static noise increases again. Then one would turn the knob in the other direction and repeat the process until the feedback from the radio is clear (or, to be specific, sufficiently clear for our audible range).
That’s essentially what the Newton-Raphson formula does mathematically. And you may ask, how so?
The Newton-Raphson Method can be described using the below equation:

X_(n+1) = X_n - f(X_n) / f'(X_n)

where X_n is the current guess and X_(n+1) is the next one.
Essentially, the position of the knob is our independent variable X. The sound we hear from the radio (i.e. the feedback) is f(X). Our desired channel is at X_solution, where f(X_solution) = 0.
We start with an initial position X0 and get the initial feedback f(X0). Depending on how loud the static noise is, i.e. how far f(X0) is from 0, we move in the direction where the static noise becomes softer, i.e. where f(X1) is closer to 0 than f(X0). How much should we turn? It depends on two things: 1) if f(X0) is far away from 0, i.e. the static noise is very loud, we make a bigger turn of the knob; 2) if a small turn doesn't reduce the static much, we turn the knob faster, and conversely, if a small turn reduces the static noise tremendously, we turn the knob slower. The second effect is captured by f'(X) in the denominator, which parallels the change in static noise with respect to the turning of the knob.
The negative sign ensures that the algorithm always progresses in the direction of the root solution, i.e. against the direction of increasing error. Similar to the Bisection Method, we need to apply some tolerance. This is equivalent to our audible range: the end result is not a 100% static-free sound, but it is virtually static-free, because the remaining static is so marginal that it doesn't affect us.
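In code, the iteration is just a few lines. Here is a minimal sketch in Python; the names newton_raphson, f_prime, and x0 are illustrative choices of mine:

```python
def newton_raphson(f, f_prime, x0, tol=1e-6, max_iter=50):
    """Iterate X_(n+1) = X_n - f(X_n) / f'(X_n) until |f(X)| falls within the tolerance."""
    x = x0
    for _ in range(max_iter):
        y = f(x)
        if abs(y) < tol:         # the "static noise" is below our audible range
            return x
        x = x - y / f_prime(x)   # loud noise or a flat slope means a bigger turn of the knob
    raise RuntimeError("Newton-Raphson did not converge within max_iter iterations")

# Same example as before: x**3 - x - 2 = 0, with derivative 3x**2 - 1
root = newton_raphson(lambda x: x**3 - x - 2, lambda x: 3 * x**2 - 1, x0=1.5)
```

Near the root, Newton's Method typically converges much faster than bisection, which is why it is worth the extra requirement of a derivative.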
One drawback of Newton's Method is that it relies on the derivative of the function f(X), i.e. f'(X). Sometimes, this cannot be easily obtained. Here comes a modification of Newton's Method, namely the Secant Method:

X_(n+1) = X_n - f(X_n) × (X_n - X_(n-1)) / (f(X_n) - f(X_(n-1)))
This makes use of the first-order finite difference approximation of the first derivative of the function f(X), as shown below:

f'(X_n) ≈ (f(X_n) - f(X_(n-1))) / (X_n - X_(n-1))
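A minimal sketch of the Secant Method in Python, again with illustrative names of my own; note that it needs two starting guesses rather than a derivative:

```python
def secant(f, x0, x1, tol=1e-6, max_iter=50):
    """Newton-Raphson with f'(X) replaced by the slope through the last two guesses."""
    for _ in range(max_iter):
        y0, y1 = f(x0), f(x1)
        if abs(y1) < tol:
            return x1
        # Estimate the derivative from the two most recent points, then step
        x0, x1 = x1, x1 - y1 * (x1 - x0) / (y1 - y0)
    raise RuntimeError("Secant method did not converge within max_iter iterations")

root = secant(lambda x: x**3 - x - 2, 1.0, 2.0)
```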
The reason why the Secant Method is brought up here is that, together with the Bisection Method and one more ingredient, inverse quadratic interpolation, it makes up Brent's method. Brent's method is very commonly used due to its rapid convergence, amongst many other benefits, thanks to the synergies among the three building blocks.
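In practice, one rarely needs to hand-roll Brent's method: SciPy, for example, ships it as scipy.optimize.brentq, which, like bisection, only needs a bracket where the function changes sign:

```python
from scipy.optimize import brentq

# Same example function; brentq needs a sign-changing bracket, like bisection
root = brentq(lambda x: x**3 - x - 2, 1.0, 2.0)
```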
Understanding the root of the problem
As addressed briefly earlier, the reason why we implement root search algorithms is that many problems do not have an analytical solution. Such problems are extremely common in finance.
For starters, the pricing formula for a coupon-paying bond is a prime example. The bond pricing formula expresses the price of the bond as a closed-form function of the yield-to-maturity, the coupon payouts, and the timing of the payouts and maturity. However, for any bond with more than one payout in its lifetime, we cannot solve for its yield-to-maturity analytically, because the formula cannot be inverted, i.e. the yield cannot be expressed as a function of the other variables.
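As a sketch of how this looks in practice (the bond specification and helper names below are made up for illustration), the yield-to-maturity is simply the root of "model price minus market price":

```python
from scipy.optimize import brentq

def bond_price(ytm, face=100.0, coupon_rate=0.05, n_years=10, freq=2):
    """Price of a plain-vanilla coupon bond discounted at a flat yield-to-maturity."""
    c = face * coupon_rate / freq                 # coupon per period
    periods = n_years * freq
    y = ytm / freq                                # per-period yield
    pv_coupons = sum(c / (1 + y) ** t for t in range(1, periods + 1))
    pv_face = face / (1 + y) ** periods
    return pv_coupons + pv_face

market_price = 95.0
# The yield-to-maturity is the root of f(y) = bond_price(y) - market_price
ytm = brentq(lambda y: bond_price(y) - market_price, 1e-6, 1.0)
```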
Another popular formula that cannot be inverted is the Black-Scholes-Merton equation. In option pricing, finding the implied volatility is extremely important as it is the only non-observable variable in the equation (therefore implied — you may refer to my first article of the series for more info on implied volatility). The root search algorithms come in handy for such situations.
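The same root-finding recipe applies here: the implied volatility is the sigma at which the model price matches the observed market price. Below is a sketch, where the option parameters and function names are made up for illustration:

```python
from math import exp, log, sqrt

from scipy.optimize import brentq
from scipy.stats import norm

def bs_call_price(S, K, T, r, sigma):
    """Black-Scholes-Merton price of a European call option."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm.cdf(d1) - K * exp(-r * T) * norm.cdf(d2)

market_price = 10.5
# Implied volatility is the root of f(sigma) = model price - market price
implied_vol = brentq(
    lambda sigma: bs_call_price(S=100.0, K=100.0, T=1.0, r=0.02, sigma=sigma) - market_price,
    1e-6, 5.0,
)
```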
Bigger than it seems
Apart from finding the yield-to-maturity from a bond pricing formula and the implied volatility in the Black-Scholes-Merton model, there are many other situations in finance where we need numerical methods. These algorithms are not difficult, but they can form part of the building blocks of more efficient and advanced algorithms. It is important to build some intuition for these algorithms as a first step towards tackling the more sophisticated models.
I hope you find the analogies in this article helpful in simplifying the root search algorithms. While analogies help to build relatable intuitions, they are still far from perfect. There are some complexities that cannot be accurately explained through simplification. I highly recommend the book “A Primer for the Mathematics of Financial Engineering” by Dan Stefanica, which compiles the most important math for quantitative finance and provides sufficient depth for serious learners. Please do like and share this article if you find it helpful. Thank you!