# Exact Complexity of Mergesort, and an R Regression Oddity

It’s nice to be back after a pretty crazy two weeks or so.

Let me start off by stating that this blog post is simply me pondering and may not be correct. Feel free to comment on inaccuracies or improvements!

In preparation for an exam and my natural tendencies to be masochistic, I am forcing myself to find the exact complexities of some sorting algorithms and I decided to start with a favorite – mergesort. Mergesort divides an array or linked list first into two halves (or close to it) and then recursively divides the successive lists into halves until it ends up with two lists containing 1 element each – the base case. The elements are then compared and switched so that they are in order, and form their own list.

At successive levels we compare the last element of the first sublist to the first element of the second sublist and merge them together to form another list. This process continues up the recursion tree until the entire original list is sorted. For a more comprehensive and precise description, see this article on mergesort.

Easy: The Worst Case

The worst case is easy as any CS student will tell you. Looking at the recursion trees below, we see that in the worst case, we must perform $O(\log{n})$ “layers” of “parallel” merges corresponding to the height of the recursion tree, each merge performing $O(n)$ comparisons. Then, the worst case complexity of merge sort is $O(n \log{n})$.

Three Cases

Trivially, if the number of elements is a power of 2 (figure a), all of the sublists at each level of the recursion tree will have the same size. This forms a recursion tree that is full and balanced. In this case, we have the worst case complexity because each element is involved in exactly $\log_2{n}$ merge operations each of which take $O(n)$ time. In situation b the number of elements is 1 smaller than a power of 2 (i.e. 3, 7, 15). In situation c, the number of elements is 1 greater than a power of 2 (i.e. 3, 9, 17).

 (a) (b) (c)

In situation a, the number of merge operations required for each of the $n$ elements is $\log_2{n}$ which yields $O(n \log{n})$ operations.

In situations b and c, some elements require more merge operations than others; however, the number of merges differs by at most 1. The number of merges for each element is approximately $\log_2{n}$ and is exactly $\lfloor \log_2{n} \rfloor$ or $\lceil \log_2{n} \rceil$, then, the total number of work performed is equal to

$T(n) = a \lfloor \log_2{n} \rfloor + b \lceil \log_2{n} \rceil$

In situation a, we have

$T(n) = a \lfloor \log_2{n} \rfloor + b \lceil \log_2{n} \rceil = a \log_2{n}+ b \log_2{n} = \left( a + b \right) \log_2{n} = n \log_2{n}$

Then what? We need to find expressions for the constants $a$ and $b$. I will call them $a_n$ and $b_n$. I drew out several recursion trees and got the following table:

From the table, I extracted the following relationships. I will leave the proof by induction to the reader.

$b_n = \left\{ \begin{array}{ll} n - 2 & \mbox{if } n = 2^i + 1, i = 1, \ldots \\ \frac{n}{2} & \mbox{if } n = 2^i, i = 1, \ldots \\ b_{n-1} - 1 & \mbox{otherwise} \\ \end{array} \right.$

$a_n = \left\{ \begin{array}{ll} 2 & \mbox{if } n = 2^i + 1, i = 1, \ldots \\ \frac{n}{2} & \mbox{if } n = 2^i, i = 1, \ldots \\ a_{n-1} + 2 & \mbox{otherwise} \\ \end{array} \right.$

This is not quite exactly $n \log{n}$ as I have heard some people say, but it is pretty darn close ($O(n \log{n})$). Using the code below, I simulated several values for $a_n$ and $b_n$ and the corresponding plot for $T(n)$.

n <- seq(1,2000)
an <- bn <- ifelse(ceiling(log(n)/log(2))== floor(log(n)/log(2)),n/2,0)
i <- seq(1,floor(log(max(n))/log(2)))
bn[2**i + 1] <- n[2**i + 1] - 2
an[2**i + 1] <- 2
an[1] <- 0
bn[1] <- 0

for (i in 2:length(an)){
an[i] <- ifelse(an[i] == 0, 2 + an[i-1], an[i])
bn[i] <- ifelse(bn[i] == 0, bn[i-1] - 1, bn[i])
}

T <- an*ceiling(log(n)/log(2)) + bn*floor(log(n)/log(2))

plot(T~n, type= l ,main="Exact Runtime of Mergesort", xlab=expression(italic(n)), ylab=expression(italic(T(n))))