Tag Archive for 'Code'

Confessions, consessions and depression

I give up.  After years of fighting everyone in my department I give up.  They win.  What am I talking about?  Which operating system to write code on.  At UAF you start of using Visual Studio, much to the dismay of many students.  Creating small single file projects with MSVS is like killing flies with a bazooka; it may get the job done but there is a lot of collateral damage.  And I am tired of picking up the pieces.

All the tools available, particularly the  ones related to my thesis, either “work” on Windows, theoretically work on Windows, or openly flame Windows.

So starting this weekend I am going to install Linux (Ubuntu most likely) and refresh myself in the ways of gcc.  Sigh.

Boost Asio Serial_Port Demo

So I’ve recently started playing with hardware more.  This necessitated learning how to communicate my C++ with the actual hardware.  The de facto method for this for decades has been the RS232 serial port.  But how to actually use this to talk with code?

In Unix you can directly talk to dev/ttyS0 or wherever the device is located.  Windows gives you some com stuff to talk to it.  But is there a way to do it cross platform?

Of course there is, otherwise I wouldn’t write about it.  Enter Boost, the pivotal C++ library.  In Boost::Asio (ASynchronous Input/Output) there exists a serial_port class.  Seems to be good enough, all that is left is to get it to work.  Easier said than done.  There are basically no examples and the doc’s aren’t very helpful (weren’t to me at least).

So here is a small program that covers most of what a serial_port class needs to do.  You set the parameters before the program starts (certain os’s have limits built in, like you must set the baud rate beforehand etc).  The hardware this talks to takes a baud rate of 19200, 8 bits of data, no parity, and 1 stop bit.  My test board only read input so the example doesn’t take input yet.  Check the comments for more details, the license is at the top of the file.

Due to the stupidity of Wordpress the code format kept getting eaten so you have to download the file instead.

Link to file: boost_serial_port_demo.cpp

Dreaming of clouds

Cloud computing is a somewhat nebulous term currently.  I interpret it to mean a way of writing code, it runs somewhere and just works.  You don’t care where the server is, what kind of database it uses or any of that.

IEEE has a wonderfully vague definition: “It is a paradigm in which information is permanently stored in servers on the Internet and cached temporarily on clients that include desktops, entertainment centers, table computers, notebooks, wall computers, handhelds, etc.”

Some examples that fit this definition are: Bittorrent, Facebook, Google Chrome, Grid Computing, Ruby on Rails; the list goes on.  That is sufficiently general as to be worthless.

One thing is certain, cloud computing is HOT.  It is flavor of the month of keywords.  People talk about using cloud computing for all kinds of things without knowing what it is ( it seems no one really knows either ).

But as it stands now, despite the continuous buzz, it will not work, because there is a fatal flaw.

Amazon has the Elastic Compute Cloud, Microsoft recently published they are planning a cloud based OS, Google has, well pretty much everything they do falls under cloud computing.  New companies every day say “I want a piece of that fat cloud pie!”  This will only accelerate the problem.  If the industry sees the problem soon enough, as I have, then cloud computing has a rich future.  If not it will join Web 2.0 in stagnation and, eventually, extinction. ( I know Web 2.0 is not dead but it certainly has a head start down the trail cloud computing is on. )

The problem stems from the fact that since no one knows what cloud computing is, everyone gets to form their own definition.  Definitions are only a symptom.  I feel the main goal of cloud computing is to write code and it will work.  You don’t care how, you don’t care where, it just works.  Really that is the goal of all code, I don’t really care that the file my program needs is located in My Documents in Windows or somewhere else in Linux and another place in Mac, I just want to open the damn file.

Stories help illustrate points, here is one about a devoloper named Dave.

Dave has a cool idea for a web app.  It has needs a database and some server space and Dave has no experience with either.  He hears Amazon has a cloud setup that allows him to just write code that works without knowing all the gory details.  Extra bonus: if his service takes off and gets thousands of users it will scale automatically.  Dave loves that and whats more he buys books from Amazon, so he knows they can be trusted.  Dave can’t sign up fast enough.  And for a time, things are good.  It does what he wants but isn’t everything he imagined.  One day while explaining his project to Joe at the water cooler, Joe nods and follows along.  He then says “Dave, that sounds good but all the problems you are having just don’t happen in Google’s cloud system, it really is a much better fit for what you are doing.”  This gets Dave thinking, he starts looking into it and Joe is right, it would be a better fit.  So he signs up for Google’s cloud computing to see the differences.  Lo and behold it really is perfect.  He decides to transfer immediately.  Now the problem happens: Dave has tons of fully functional code on Amazon’s Cloud and now it needs to work on Google’s Cloud.  The only way to do that is to re-write it from scratch.  Well not scratch per se, Dave knows everything it needs to do, he just needs to change all the API specific stuff out.  That could be easy or nigh impossible depending on a variety of factors.

This is called vendor lock in.  If you use one product for long enough your stuff tends to revolve around it.  This has the effect of cementing it into your foundation.  After a while the only way to remove it is to destroy the entire house and rebuild from scratch, something most people and companies will never do.

The solution to this is so simple it is scary.  Make a standard.  These are the functions that will be supported on all cloud server architectures come hell or high water.  They work the same no matter who is hosting it.  A vendor can make special non-standard functions and users can use them at their own portability peril but it doesn’t stop new ideas from happening.  Any sufficiently cool idea should be added to the standard.  This way Dave can take his code from Amazon to Google or Microsoft or any of the others and it will do what was promised: it will work.

Programming languages have been doing this for years.  They all come with a set of functionality ( how much depends on the language, compiler, etc ) and it works regardless of where you write that code.  C++ programmers can use the boost library but not everyone is guaranteed to have it, but int better damn well be declared and work the way I know it should in all my C++ compilers.

If cloud computing embraces a standard then it can save itself from death.  It would allow people to change vendors when it suited them, to not be locked in for all eternity.  More importantly it would allow the bad vendors to die, instead of leeching off of one group who is too entrenched to get rid of them.  Code portability is something that would elevate cloud computing to its place in the clouds, instead of wandering around buzz word marsh until it winds up rotting in forgotton bog.

Big O Notation

It is time for a primer on big o.
Big O

No, not that Big O. I am talking about the one we use in computer science. (If you don’t undestand the picture it is from an anime called The Big O)

What is big o notation?

Big O notation is a method used in computer science, and mathematics, to describe the efficiency of an algorithm. There are several other notations but big o is the most common for computer science. In short it is a function that describes the worst case performance of an algorithm. Big o means given n inputs the algorithm uses O(x), pronounced O of x, resources. x is a function that depends on n. Generally we are measuring operations, and thus speed, but it can also be used for memory usage or many other metrics.  Big o is useful because it tells us the algorithm will never be any worse than this. Little o tells you it can never be better than this and theta notation says it will be in this family, but those are beyond the scope of this post.

Figuring out big o

Actually proving some algorithm belongs to a certain set of big o families is not always easy but it is usually pretty simple to ballpark it. There are several commonly used types that we will cover.

Constant Time

Let’s look at a simple function, it just returns a value from an array given the index.  All the examples are going to be in C++, just a forewarning.

int getIndex( int Data[], int Index )
{
   return Data[Index];
}

This will take the same amount of time no matter what values we give it. We call this constant time or O(1). Now for a more interesting example.

Logarithmic Time

Logarithmic time is most often found where you partition the data into two even parts then continue working on one of them.  Binary search is a good example of this, it looks for a value in a sorted array.  It chooses a pivot value in the middle, determines if the value it is looking for is above or below the pivot then starts the search again using the new partition.

int binarySearch(int sortedArray[], int first, int last, int key) {
   while (first <= last) {
      int mid = (first + last) / 2; // compute mid point.
      if (key > sortedArray[mid])
         first = mid + 1; // repeat search in top half.
      else if (key < sortedArray[mid])
         last = mid – 1; // repeat search in bottom half.
      else
         return mid; // found it. return position /////
   }
return -(first + 1); // failed to find key
}

The proof showing that the relation between amount of work down and the number of inputs is not trivial so I will just say that these types of algorithms are classified as O(log n).  (Look up master theorem if you need more nitty gritty)  Technically it should be log base 2 n but few people complain if you just say log n.

Linear Time

Here is a function that takes an array of ints and prints them out.

void printInts( int Data[], int Size )
{
   for( int i = 0; i < Size; i++ )
   {
      std::cout << Data[i] << std::endl;
   }
}

We would say that Size is n because that is how many elements we have to look at. Since there is no way to get the job done without printing each of the n elements this is O(n) or linear time.

Polynomial

If you take a loop that is called n times and place a loop that does an additional n work what do you get? You get polynomial time.

void printAll( int Data[][], int SizeOne, int SizeTwo )
{
   for( int i = 0; i < SizeOne; i++ )
   {
      for( int j = 0; j < SizeTwo; j++ )
      {
         std::cout << Data[i][j] << ” “;
      }
      std::cout << std::endl;
   }
}

If we say SizeOne is n and SizeTwo is m, then we do n work, but each n work consists of m work as well. So we have to do n*m work. If n = m we would get n*n or n^2 work. Polynomial time is anything of the form n^m, where m doesn’t have to be an integer but it is greater than one. O(n^2) is referred to as quadratic because it appears often enough to get a special name.  This is written as O(n^2), O(n^3), etc.

Others

Exponential is O(c^n), where c is some some constant value. Examples that use this would be the traveling salesman problem, a notoriously hard problem in computing.

N to the n is O(n^n), something you should avoid writing if at all possible. These are slow. And by slow I mean the universe will end before this code finishes running.

In summary here are the most common forms and their names.
Big O Notation table

Why You Care

If each operation takes one second, then with n at 1 million this is how long it takes for each of the common ones to run.

Big O Timing

It should be obvious from the chart that a small change in the big o value of your algorithm can make a huge difference in computation time.  For most problems you can find something that is polynomial or better, log-linear being the best you can get for many problems.

Big o is also the worst case performance.  So getting O(n^2) may only occur when all the ducks are in a row, average case may be O(n log n) which is much faster for large n.  Another thing to consider is while many algorithms may have the same big o implementations affect performance as well.  So for small data sets a lean O(n^2) algorithm may out perform a O(n log n) one that has large amounts of overhead.  All things to keep in mind when you are writing code.