
I already asked this question on Stack Overflow, but people there suggested I ask it here instead.

Let's consider the classic definition of big-O notation (proof link):

$O(f(n))$ is the set of all functions $g$ such that there exist positive constants $C$ and $n_0$ with $|g(n)| \leq C \cdot f(n)$ for all $n \geq n_0$.


According to this definition, it is legal to do the following ($g_1$ and $g_2$ are functions describing the complexities of two algorithms):

$$g_1(n) = 9999 \cdot n^2 + n \in O(9999 \cdot n^2)$$

$$g_2(n) = 5 \cdot n^2 + n \in O(5 \cdot n^2)$$

And it is also legal to write the functions as:

$$g_1(n) = 9999 \cdot n^2 + n \in O(n^2)$$

$$g_2(n) = 5 \cdot n^2 + n \in O(n^2)$$
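
(To check this against the definition above, we can take, for example, $C = 10000$ and $n_0 = 1$:

$$9999 \cdot n^2 + n \leq 9999 \cdot n^2 + n^2 = 10000 \cdot n^2 \quad \text{for all } n \geq 1,$$

so $g_1 \in O(n^2)$; the same argument with $C = 6$ works for $g_2$.)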

As you can see, the first variant, $O(9999 \cdot n^2)$ vs. $O(5 \cdot n^2)$, is much more precise and makes it clear which algorithm is faster. The second one tells us nothing about that.

The question is: why does nobody use the first variant?

It is obvious that if the functions have different orders of growth (for example, $n^2$ vs. $n$), we don't care about constants. But if they have the same order of growth, the constants play a very important role in understanding which function grows more slowly.
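
(A concrete illustration of the first case, with made-up constants: even a huge constant on the lower-order function stops mattering past a fixed threshold, since

$$9999 \cdot n \leq 5 \cdot n^2 \quad \text{whenever } n \geq 2000.)$$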

My reasoning for why nobody specifies constants is the following:

  1. Constants are not very representative. For example:

```c
void foo(int n) {
    int x = 1;            /* 1 operation */
    for (int i = 0; i < n; i++) {
        x += 1;           /* 1 operation per iteration */
        x = x << 1;       /* 1 operation per iteration */
    }
}
```

The complexity will be $T_1(n) = 1 + (1 + 1) \cdot n = 1 + 2n$.

```c
void boo(int n) {
    int x = 1;            /* 1 operation */
    for (int i = 0; i < n; i++) {
        x += 1;           /* 1 operation per iteration */
        x = x * 1;        /* 1 operation per iteration */
    }
}
```

The complexity will be $T_2(n) = 1 + (1 + 1) \cdot n = 1 + 2n$.

So, as we can see, both complexities are the same. The problem is that foo will execute faster than boo, because of the << operation.

We could give each operation its own weight, but that would be too complicated, so to keep things simple we just say: no constants in big O.
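
For what it's worth, here is a minimal sketch of how one might try to observe those constant factors empirically, reusing foo and boo from above (the loop bound `n` and the use of `unsigned` to avoid signed-overflow issues are my additions; the measured difference, if any, is entirely machine- and compiler-dependent):

```c
#include <stdio.h>
#include <time.h>

volatile unsigned int sink;   /* written to, so the compiler cannot delete the loops */

/* Same bodies as foo/boo above, with unsigned x to avoid signed overflow. */
void foo(int n) {
    unsigned int x = 1;
    for (int i = 0; i < n; i++) { x += 1; x = x << 1; }
    sink = x;
}

void boo(int n) {
    unsigned int x = 1;
    for (int i = 0; i < n; i++) { x += 1; x = x * 1; }
    sink = x;
}

int main(void) {
    const int n = 1000000000;   /* large enough to take measurable time */
    clock_t t0 = clock(); foo(n);
    clock_t t1 = clock(); boo(n);
    clock_t t2 = clock();
    printf("foo: %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("boo: %.3f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
    return 0;
}
```

Note that with optimizations enabled, both loops may well compile to the same (or no) code at all, which is itself evidence of how unstable these constants are.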

The question is: is my reasoning correct, or are there other reasons why we do not specify constants in big-O notation?

1 Answer


Evaluating these big-O constants in real-life situations is nearly impossible. Each computer has different hardware, and the precise cost of each operation (or even of one specific type of operation) depends on many different factors, such as:

  1. What kind of memory did we need to access?
  2. Did we have a cache hit/miss?
  3. Can these kinds of operations be executed in parallel batches by the CPU?

And many, many more.
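
To make the cache-hit/miss point concrete, here is a minimal sketch (the array size N is an assumption, and the exact slowdown varies per machine). Both traversals perform exactly N·N additions, so their big-O, and even their raw operation counts, are identical, yet the column-major loop typically runs several times slower because it misses the cache on almost every access:

```c
#include <stdio.h>
#include <time.h>

#define N 4096

int main(void) {
    static int a[N][N];         /* static: zero-initialized, kept off the stack */
    long sum = 0;
    clock_t t0 = clock();
    for (int i = 0; i < N; i++)         /* row-major: sequential memory, cache-friendly */
        for (int j = 0; j < N; j++)
            sum += a[i][j];
    clock_t t1 = clock();
    for (int j = 0; j < N; j++)         /* column-major: strided memory, cache-hostile */
        for (int i = 0; i < N; i++)
            sum += a[i][j];
    clock_t t2 = clock();
    printf("row-major:    %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("column-major: %.3f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
    printf("sum = %ld\n", sum);         /* use sum so the loops are not optimized away */
    return 0;
}
```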

Even in a more theoretical setting, examining the constants can be very labor-intensive and complicated. To do so, we would need to assign a specific cost to each basic operation used by an algorithm instead of treating them all as unit-cost, and the complexity analysis would have to count operations in far more detail and care.
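
As a toy example of such a weighted analysis (the weights here are entirely made up), the function foo from the question, counting the two initializations, the $n+1$ loop-condition checks, and the $n$ iterations of increment, addition, and shift, would become

$$T_{\text{foo}}(n) = 2\,c_{\text{init}} + (n+1)\,c_{\text{cmp}} + n \left(c_{\text{inc}} + c_{\text{add}} + c_{\text{shift}}\right),$$

where every weight $c_{\ast}$ would itself have to be justified for a concrete machine.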

For me, the best way to see why we ignore constants is simply to study different algorithms and computer architecture, and while doing so to think about how much more work you would need to invest to discuss constants accurately, and how little real-life benefit you would receive from it.

— Dean Gurvitz