Monthly Archive for May, 2008

Big O Notation

It is time for a primer on big o.
Big O

No, not that Big O. I am talking about the one we use in computer science. (If you don’t undestand the picture it is from an anime called The Big O)

What is big o notation?

Big O notation is a method used in computer science, and mathematics, to describe the efficiency of an algorithm. There are several other notations but big o is the most common for computer science. In short it is a function that describes the worst case performance of an algorithm. Big o means given n inputs the algorithm uses O(x), pronounced O of x, resources. x is a function that depends on n. Generally we are measuring operations, and thus speed, but it can also be used for memory usage or many other metrics.  Big o is useful because it tells us the algorithm will never be any worse than this. Little o tells you it can never be better than this and theta notation says it will be in this family, but those are beyond the scope of this post.

Figuring out big o

Actually proving some algorithm belongs to a certain set of big o families is not always easy but it is usually pretty simple to ballpark it. There are several commonly used types that we will cover.

Constant Time

Let’s look at a simple function, it just returns a value from an array given the index.  All the examples are going to be in C++, just a forewarning.

int getIndex( int Data[], int Index )
{
   return Data[Index];
}

This will take the same amount of time no matter what values we give it. We call this constant time or O(1). Now for a more interesting example.

Logarithmic Time

Logarithmic time is most often found where you partition the data into two even parts then continue working on one of them.  Binary search is a good example of this, it looks for a value in a sorted array.  It chooses a pivot value in the middle, determines if the value it is looking for is above or below the pivot then starts the search again using the new partition.

int binarySearch(int sortedArray[], int first, int last, int key) {
   while (first <= last) {
      int mid = (first + last) / 2; // compute mid point.
      if (key > sortedArray[mid])
         first = mid + 1; // repeat search in top half.
      else if (key < sortedArray[mid])
         last = mid – 1; // repeat search in bottom half.
      else
         return mid; // found it. return position /////
   }
return -(first + 1); // failed to find key
}

The proof showing that the relation between amount of work down and the number of inputs is not trivial so I will just say that these types of algorithms are classified as O(log n).  (Look up master theorem if you need more nitty gritty)  Technically it should be log base 2 n but few people complain if you just say log n.

Linear Time

Here is a function that takes an array of ints and prints them out.

void printInts( int Data[], int Size )
{
   for( int i = 0; i < Size; i++ )
   {
      std::cout << Data[i] << std::endl;
   }
}

We would say that Size is n because that is how many elements we have to look at. Since there is no way to get the job done without printing each of the n elements this is O(n) or linear time.

Polynomial

If you take a loop that is called n times and place a loop that does an additional n work what do you get? You get polynomial time.

void printAll( int Data[][], int SizeOne, int SizeTwo )
{
   for( int i = 0; i < SizeOne; i++ )
   {
      for( int j = 0; j < SizeTwo; j++ )
      {
         std::cout << Data[i][j] << ” “;
      }
      std::cout << std::endl;
   }
}

If we say SizeOne is n and SizeTwo is m, then we do n work, but each n work consists of m work as well. So we have to do n*m work. If n = m we would get n*n or n^2 work. Polynomial time is anything of the form n^m, where m doesn’t have to be an integer but it is greater than one. O(n^2) is referred to as quadratic because it appears often enough to get a special name.  This is written as O(n^2), O(n^3), etc.

Others

Exponential is O(c^n), where c is some some constant value. Examples that use this would be the traveling salesman problem, a notoriously hard problem in computing.

N to the n is O(n^n), something you should avoid writing if at all possible. These are slow. And by slow I mean the universe will end before this code finishes running.

In summary here are the most common forms and their names.
Big O Notation table

Why You Care

If each operation takes one second, then with n at 1 million this is how long it takes for each of the common ones to run.

Big O Timing

It should be obvious from the chart that a small change in the big o value of your algorithm can make a huge difference in computation time.  For most problems you can find something that is polynomial or better, log-linear being the best you can get for many problems.

Big o is also the worst case performance.  So getting O(n^2) may only occur when all the ducks are in a row, average case may be O(n log n) which is much faster for large n.  Another thing to consider is while many algorithms may have the same big o implementations affect performance as well.  So for small data sets a lean O(n^2) algorithm may out perform a O(n log n) one that has large amounts of overhead.  All things to keep in mind when you are writing code.

Iron Man (no spoilers)

I went and saw Iron Man tonight.  Go see it if you are a fan of any of the following: iron man (the character), comic book movies, robots, action, comedy or just great films.  I won’t write a review that gives anything away for a few weeks to give everyone time to go see it.  I loved it, great film.  Oh and stay for the full credits there is a scene after that is worth watching if you already sat through the movie.  And judging by the previews this is going to be a good year for movies.

Wordpress Themes and new look

If anyone out there checks back periodically you will probably notice the theme changing at fairly random intervals. That is because I haven’t been able to find a look that really fits what I feel this blog is all about. My Movable Type blog looked great and I am struggling to find something similar in Wordpress. Mostly I am being very picky. I can’t stand to read large blocks of black text against a white background and wouldn’t force any of you to do so. After an hour my retinas feel like they have third degree burns.

So I need a theme that has some grey, but not too dark. Black background is kind of cliched as emo and/or noob blog. I like blue but most of the themes are way too blue. And so forth. The ones that do match my color scheme have other quirks though. Like not showing the entry title or author. I know my name is on the blog title and only I post here, but that may not always be the case. I like to make sure everyone gets credit for their own work. So keep watching as I cycle through the vast ocean of blog themes in a futile effort to silence my inner criticism.

*edit* I settled on a theme called  Smooth Dirt.  The header image was originally a big ad for the guy who made but since his credits are at the bottom I didn’t feel bad modifying it to better fit my needs.  We shall see how it works out, I may spontaneously change my mind at a later date.