Tuesday, May 24, 2016

Big O notation and visualization


Most of time algorithm performance is measured by time complexity . Space complexity is also important but it is compromised most of time . If space required is not much compared to modern hardware then it can be ignored .

In computer science when there are two or more ways or algorithms to solve the problem we need to choose best one which completes in less time .

Practical way is ,for same hardware ,  for same size , run algorithms , use e.g. System.nanoTime()  to measure time  then compare them together .  We might need to run experiment for all possible sizes .
Instead of writing all possible approaches and measuring time for actual scenarios is time consuming.

Is there any other way for such comparison  of algorithms ?
Yes there is which uses asymptotic notation called Big_O_notation .

Big O notation denoted maximum time taken by algorithm for given size of problem .
It is asymptotic (tangent at infinity) upper bound.

 f(n) = O(g(n)) and if there exists constant c and  n0 such that
f(n) <= c*g(n) for n >= n0

 f(n) ,  g(n) are functions over non-negative integers .

Big O notation is used for worst case analysis of algorithms .

Instead of calculating actual time , it measures growth of time with growth of the problem size .

Every time size of problem increases , time is bound to increase .

Example :  To find maximum element in array using scanning is O(n) means as size of array increases we will need more time to find the maximum element . Say maximum element is located at end . If array size is of 10 then it takes 10 mSecs to find the element . If array size if 10 million then time requires could be 100 secs which is 1.5 min . if array size is of 1 billion then it will be 1500 min which is too large .

If say 1 Billion array is circularly sorted and we use binary search for the problem which is O(log(n)) then time required will be 20 mSec which is very fast .

So if we have efficient algorithm then on even commodity machine we can give fast outputs.

To calculate  "Big O notation" we need to consider only higher order terms .

Example :

n^2 + 4n+3 is O(n^2)

nlog2(n) + 10n + 1 is O(nlog2(n))


Asymptotic running time by comparison n --> infinity

1 < log2(n) < n < nlog2(n) < n^2 < 2^n


Theta (asymptotically tight bound ) , Ohh (upper bound ) and Omega (lower bound) notation differences


Figure (a) : Theta notation consider theta function sandwiching actual function  by upper and lower bounds
Figure (b) : Oh notation consider it as equal or upper bound or worst case
Figure (c) : Omega notation is function which confines equal or lower bounds. consider it as best case where time required is less.

little Ohh notation is strictly above the function .
little Omega is strictly below the function .

e.g.  Insertion sort for already sorted array Theta is between n^2 and n so big ohh is n^2 and omega is n only.

Most of time you will check big ohh notation because we are interested in worst case performance .

Graph of T(n) = O(g(n))

 From left to right (Y axis to X)
O(2^n), O(n^2) , O(n*log2(n)) ,O(n) , O(sqrt(n)) ,  O(log2(n))

Slowest growth of time for given input is O(log2(n)) .




Legends

Y - time in secs to execute the algorithm
X - number of items as input to algorithm







No comments: