Let's start at the beginning.
Basically, there are 3 very popular notations used to express the time complexity of algorithms:
- $\Theta(g(n))$,
- $\mathcal{O}(g(n))$ (this is the well-known Big O notation),
- $\Omega(g(n))$.
The first thing that is often a bit confusing (and misused) is that these notations denote sets - sets of functions. For example, the interpretation of $\Theta(g(n))$ is as follows:
$$\Theta(g(n)) = \{ f(n) \; | \; \text{There exist constants } c_1, c_2 \text{ and } n_0 \text{ such that } 0 \leq c_1 \cdot g(n) \leq f(n) \leq c_2 \cdot g(n) \text{ for all } n \geq n_0. \}$$
So, $\Theta(g(n))$ is the set of those functions $f(n)$ for which $g(n)$ can serve as both an upper and a lower bound (with the given constants $c_1$ and $c_2$). In other words, if you plot these 3 functions, $f(n)$ lies between $c_1 \cdot g(n)$ and $c_2 \cdot g(n)$ - at least for all inputs larger than or equal to $n_0$. (It's worth searching for figures illustrating this so that you can get a better understanding of it.)
Note: because the bounds only have to hold for all inputs beyond $n_0$, these notations are also called asymptotic notations.
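If it helps, here is a tiny numeric sanity check of the definition above (the concrete $f$, $g$ and the constants are just hypothetical values I picked for illustration, not something coming from your question):

```python
# Illustrative check that f(n) = 3n^2 + 10n is in Theta(n^2):
# with c1 = 3, c2 = 4 and n0 = 10, the sandwich c1*g(n) <= f(n) <= c2*g(n) holds.
def f(n):
    return 3 * n * n + 10 * n  # some concrete "running time" function

def g(n):
    return n * n               # the candidate bounding function

c1, c2, n0 = 3, 4, 10

# Verify the inequality for a range of inputs beyond n0 (of course, this is not a proof).
assert all(c1 * g(n) <= f(n) <= c2 * g(n) for n in range(n0, 10_000))
print("f(n) stays between c1*g(n) and c2*g(n) for all tested n >= n0")
```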
The interpretations of $\mathcal{O}(g(n))$ and $\Omega(g(n))$ are very similar to the one above, but they refer only to the upper bound or the lower bound, respectively. (Of course, the existence of a single constant $c$ is enough for these 2 definitions.)
Just in case, see the definition of $\mathcal{O}(g(n))$ below:
$$\mathcal{O}(g(n)) = \{ f(n) \; | \; \text{There exist constants } c \text{ and } n_0 \text{ such that } 0 \leq f(n) \leq c \cdot g(n) \text{ for all } n \geq n_0. \}$$
In other words, $g(n)$ is an upper bound for all the functions in the set $\mathcal{O}(g(n))$. For example, let's denote the worst-case time complexity of insertion sort by $T(n) = \frac{n(n - 1)}{2} = \frac{n^2}{2} - \frac{n}{2}$ (you can derive this easily if you think through the algorithm; a small sketch after the notes below also illustrates this count). Now, $T(n) \in \mathcal{O}(n^2)$ means that the time complexity of the algorithm is quadratic - so if the size of the input is $n$ (we have an array of $n$ elements to be sorted), the algorithm performs its most expensive operation at most $c \cdot n^2$ times (where $c$ could be, say, 1). Three notes here:
- Typically, we don't care about the lower-order terms, like $\frac{n}{2}$ in the example above, because of the asymptotic property I mentioned earlier (which roughly says that for large enough inputs, only the dominant term matters).
- Typically, we only count the most expensive operations of an algorithm - e.g., in the case of sorting algorithms, the number of comparisons.
- Many times, people use the notations $T(n) \in \mathcal{O}(g(n))$ and $T(n) = \mathcal{O}(g(n))$ as if they were equivalent (strictly speaking, they aren't, but it's a common convention since the latter form can be convenient as well).
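Just to make the insertion sort example tangible, here is a minimal Python sketch (my own illustration, assuming the usual textbook insertion sort - it's not the only way to implement it) that counts the comparisons; on a reverse-sorted input of size $n$ it performs exactly $\frac{n(n - 1)}{2}$ of them, matching the $T(n)$ above.

```python
# Count the comparisons performed by a plain insertion sort (illustrative sketch).
def insertion_sort_comparisons(items):
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift every larger element one position to the right;
        # each iteration of this loop is exactly one comparison.
        while j >= 0:
            comparisons += 1
            if a[j] > key:
                a[j + 1] = a[j]
                j -= 1
            else:
                break
        a[j + 1] = key
    return a, comparisons

n = 100
_, count = insertion_sort_comparisons(range(n, 0, -1))  # reverse-sorted: worst case
print(count, n * (n - 1) // 2)  # both are 4950
```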
So now that we are hopefully done with the basics, you can see that if a function $f(n) \in \Theta(g(n))$, then it is also true that $f(n) \in \mathcal{O}(g(n))$ and $f(n) \in \Omega(g(n))$ (in most textbooks, this is presented as a theorem together with its proof). In most cases, people don't care about the lower bounds of an algorithm's time complexity - so it shouldn't really matter whether you see $\mathcal{O}$ or $\Theta$ (this would be the answer to your second question). I think $\Omega$ is used much less frequently than the other two.
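To make that concrete, here is a short worked example with the insertion sort count from above (the constants $c_1 = \frac{1}{4}$, $c_2 = \frac{1}{2}$ and $n_0 = 2$ are just one possible choice): for all $n \geq 2$,
$$\frac{1}{4} \cdot n^2 \leq \frac{n^2}{2} - \frac{n}{2} \leq \frac{1}{2} \cdot n^2,$$
so $T(n) \in \Theta(n^2)$; taking $c = c_2 = \frac{1}{2}$ alone already shows $T(n) \in \mathcal{O}(n^2)$, and taking $c = c_1 = \frac{1}{4}$ alone shows $T(n) \in \Omega(n^2)$.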
I have never seen the $ld$ notation you mentioned. However, I would prefer $\log_2$; it will be clear to anyone. Of course, you can use whichever you like, but make sure to put it inside either $\mathcal{O}$ or $\Theta$ (I can't recall a case where only the function itself was indicated).
I truly hope that I managed to provide a detailed enough answer that you find useful. If you have any further questions, please feel free to ask.