Algorithm Analysis — Part 2: Orders of Growth, Asymptotic Notations and Case Analysis
This article is the second part of a two part series on algorithm analysis. On the last part, we built a bottom-up understanding of and how we analyze algorithms, why we do it that way and their time complexity. We ended that article with our finished mathematical model. Now, we are going to introduce the concept of orders of growth, which types of algorithms fall into each of the most famous orders and then we’re going to talk about growth of functions within the context of algorithm analysis (Big Oh, Omega and Theta notations). Lastly, we will talk about different case scenarios of algorithm performance such as worst case and best case.
If you have a certain notion of how to calculate algorithm’s complexity or if you just want to understand these topics I just mentioned, this article will certainly help you.
Our method of algorithm analysis deals with approximations, as they are simpler and good enough to analyze the behavior of algorithms with large inputs of data. Every time we analyze an algorithm, we get a function that represents its running time. As the study of algorithm analysis grew, computer scientists started to notice that most algorithms fall into certain orders of growth, i.e., most algorithms’ running times are all within a certain group of functions, each with a certain pattern that determines their order of growth.
It’s not important to memorize all of them right away since as your study of algorithm analysis goes on, they’ll come naturally to you. The most famous orders of growth are actually very few.
A constant algorithm would be a simple operation like adding two numbers together, performing an if check, assigning a value to variable. Our last example in the previous article was an algorithm that solved the 3-sum problem, which had a time complexity of cubic order. Other famous examples are the binary search algorithm, which has logarithmic complexity and the mergesort algorithm, which has a linearithmic complexity.
Memorizing these values it’s not that important. What is important is to realize that some time complexities define the boundaries between being able to solve large instances of a problem and not being able to. When our solution starts to have time complexities of quadratic order, then it’s not good enough for huge problems, as the running time would grow much, much faster than the input. A very famous example is the implementation of the Discrete Fourier Transform, which enabled technologies like the DVD, JPEG images and MRIs. An initial and simple algorithm had running time of n². Only after a new implementation, which had linearithmic complexity, were these technologies possible.
In this image we have two functions, with the red one being f(x) = x² and the black one being g(x) = x log(x). It’s clear that the linearithmic functions grows much, much slower, and therefore enables us to solve larger problems.
Asymptotic Notations and Case Analysis
The actual value of the running time of an algorithm depends, many times, on the type of the input. For example, if you’re searching for an element inside an array; and to do so you iterate through the whole array, you’re going to take less time if the first element is the one that you’re looking for. On the other hand, if the desired element is at the last position or not in the array at all, it takes longer. In the first case, our algorithm took constant time, since it only did basic operations. With the target at the end, our algorithm’s running time grew proportionally to the size of the array (n).
This realization motivated computer scientists to analyze algorithm’s differently and to stablish certain case scenarios to categorize an algorithm. The first is the best case scenario, which is pretty self-explanatory. This the case at which the input data provides the best performance for an algorithm. For our search algorithm, the best case would be if the desired element was at the first position. The second is the worst case scenario, which defines the worst possible runtime for our algorithm. Again, with our search algorithm, this would happen if the element was at the last position or even if it wasn’t in the array at all. But, many times, neither of these cases happen. If looking for a target inside an array, there are many other ways for it to be within it than at the edges. When dealing with a regular random input, we are talking about the average case scenario.
Each of these cases provide us something. The best case defines a goal for all inputs. That’s where we would like to get with every single instance of our problem, even though that might not be possible. The worst case provides a guarantee for all inputs, since it can get any worse than that. That’s where we would like to get away from. And lastly, the average case provides us with a way to actually predict the performance of our algorithm, since most entries might fit here.
With this, computer scientists gain an even better understanding of a specific algorithm and might use these techniques to improve their design of new ones. By analyzing a problem and dividing it into worst, best and average cases, we have bounds that will define the performance of any algorithm that we create.
Here it’s where asymptotic notations come in. Big Oh, Omega and Theta notations are actually used to describe the behavior of functions and how they are bounded by other functions. But, since we can describe an algorithm’s running time as a function of its input, it fits nicely.
The first and most famous one is Big Oh, represented by the letter O. Putting it short and simple, if a function f is O(g), then g is an upper bound of f. Being more mathematically rigorous, we say that f is O(g) when there is a real and positive constant c and a number n0 such that:
This notation by itself is not really useful, because we can pick a particular function and find an infinite number of other functions that could be upper bounds. For example, if f(x) = x, then x itself is an upper bound, since we can use c=2 and n0 = 0.
When allied with the notion of worst and best cases, the Big Oh notation normally represents the worst case scenario of an algorithm. If we’re looping through an array to see if there is a 0 in it, then the worst case would be if the 0 is at the last position or not in the array at all. Therefore, this algorithm is O(n).
There is also the Ω (Omega) notation, which defines a lower bound for a function, and therefore, an algorithm. If a function f is Ω(g), then g is a lower bound of f. In mathematical terms, we say that f is Ω(g) when there is a real and positive constant c and a number n0 such that:
Similarly, this would represent the best case scenario for an algorithm. If we take the same example of searching for a 0 in an array, then the best case would happen when the 0 is at the first position, taking constant time to solve. Therefore, this algorithm is Ω(1).
The final notation is Θ(Theta) , which describes a tight bound for a function. It’s a combination of both previous notations. If f is Θ(g) , then f is O(g) and Ω(g) . This means that we can use g and alongside some other constants to tightly describe f ‘s behavior as it grows. We can also say that f is of the same order as g is. Mathematically speaking, f is Θ(g) if, and only if, there are constants c1, c2 and numbers n1 and n2 such that:
When applying the Theta notation to our study of algorithm analysis and case scenarios, we arrive at the notion of an optimal algorithm. An optimal algorithm solves a problem with the same time complexity at both the worst and best case. Its lower and upper bounds match. So, if we had an algorithm that is both O(log n) and Ω(log n), then we’d say that it is Θ(log n), and also that it is optimal.
We can use these notations to describe algorithms’ running times but we can also use them to describe complexities of problems in general. When trying to solve problems, computer scientists often try to first define the difficulty (stablish upper and lower bounds). By doing this, any solution (algorithm) will have its cost within these boundaries. Then, they try to see if they can make these bounds match, by either lowering the upper bound or raising the lower bound. By doing that, they prove that there is an optimal algorithm that solves that particular problem. This part of algorithm analysis has a lot of research, especially due to its immediate applications in our current technological world
In this article, we introduced orders of growth and saw which of them are desirable for solving problems. Then, we introduced case analysis for algorithms and saw how they are connected to asymptotic notations. Lastly, we painted a bigger picture for the use case of these notations when delimitating the boundaries of a problem.
With this, we reach the end of our algorithm analysis study. If you found any mistakes within my text, feel free to comment and I’ll correct them as soon as possible. I hope this helped you in some way.